XC code examples

Started 2Feb2018, updated 13June2018

This page is in group Technology and is a blog note with some XMOS XC code examples. I also have some ported code that I thought may be placed here, and I describe some problems that I have run into. None of this is on GitHub. I guess that the parent note is XC is C plus X. Also see an overview at “My XMOS notes”, chapter MY OTHER NOTES WHERE XMOS MATTERS.

Some musts

In my opinion, bool needs to be defined.

bool

No C99

Just add this in some global.h file (or similar name):

typedef enum {false,true} bool; // 0,1

A wild example:

bool my_bool1 = true;
bool my_bool2 = false;
   
if (my_bool1) {
    my_bool2 = false;
} if (my_bool2 == false) {
    my_bool1 = true;
} else {
    my_bool1 = !my_bool2;
    my_bool2 = false;
}

(It doesn’t look like there is any problem in having the -std=c99 defined also in this case. See below:)

Compiling C as C99 but still no _Bool

C99 has a built-in type called _Bool (read about it in Wikipedia here).

To activate C99 just add -std=c99 to XCC_FLAGS in the Makefile in xTIMEcomposer (read about it in the xTIMEcomposer user guide, from here). Then add:

#include <stdbool.h>

You must remove the bool type definition above. The stdbool.h comes with the xTIMEcomposer installation, and here are its contents:

// Copyright (c) 2008 Eli Friedman
/* Don't define bool, true, and false in C++, except as a GNU extension. */
#ifndef __cplusplus
#define bool _Bool
#define true 1
#define false 0
#elif defined(__GNUC__) && !defined(__STRICT_ANSI__)
/* Define _Bool, bool, false, true as a GNU extension. */
#define _Bool bool
#define bool  bool
#define false false
#define true  true
#endif

#define __bool_true_false_are_defined 1

I am surprised to see that true and false are defined with defines, and not as part of a set. I remember some ten years ago that we cooperated with a company that used the Motorola HC08; the compiler they used didn’t support enums. Maybe that’s the reason? Friedman has commented “GNU extension” in the code, maybe the answer lies there? I see that -std=gnu99 also is possible for XCC, but I get the same result.

C99 points to 1999, quite some years ago, and it has since been superseded by C11. There is no option for C11 in the XCC compiler.

Adding the file ref and the flags should make sense. But it doesn’t. I get this:

../src/__arduino_on_xmos_test.xc:193:5: error: use of undeclared identifier `_Bool'

This is unresolved. But I’ll query at xCore Exchange. Update: I didn’t have to, there already was a post that I just added to, see Wot no bool? started by CousinItt. The answer turned out to be that _Bool is available in ‘C99’ but not in ‘xC’, thus the header file <stdbool.h> won’t work with ‘xC’. 

A general problem with testing against “true”

In the “Wot no bool” post referred to above robertxmos points out a case where

#define FALSE 0
#define TRUE !FALSE
// Some code
if (hasItem(i) == TRUE) { /* 32bit comparison is only true if index was '0' */ }
if (hasItem(i))         { /* convert to 1bit, hence all index values are true */ }

are not equal. Read it there. He shows two problems:

  1. There is one representation of FALSE but many of TRUE, since the language does not support bool and internally converts any non-zero value to a 1-bit logical value
  2. The example he shows uses a function that returns an index [0..(N-1)], but -1 if the item searched for is not found. This is standard C practice, but it’s bad practice. It invites problems. The code in the next chapter avoids it.
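The difference between the two if-tests can be sketched in plain C (not XC). This is only an illustration under my own assumptions: count_item, present_compare and present_truthy are hypothetical names and are not taken from the thread. The point is that a comparison against TRUE only matches the single value 1, while a plain truth test accepts any non-zero value.

```c
#include <stddef.h>

#define FALSE 0
#define TRUE (!FALSE) /* evaluates to 1, the only value "== TRUE" accepts */

/* Hypothetical helper (not from the thread): counts how often c occurs
   in text, so any non-zero result means "present" */
int count_item(const char *text, char c) {
    int count = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == c) {
            count++;
        }
    }
    return count;
}

/* Fragile: full-width comparison against the single value 1 */
int present_compare(const char *text, char c) {
    return count_item(text, c) == TRUE;
}

/* Robust: any non-zero value counts as true */
int present_truthy(const char *text, char c) {
    return count_item(text, c) != 0;
}
```

With the text "AAB" and the character 'A' the count is 2, so present_compare says "not present" while present_truthy says "present".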

And a code example to solve it

An example in XC where the problem shown in “Wot no bool” is not present. This compiles and runs. It’s because there are no implicit or explicit type conversions. Observe the parameter list for return values and how nicely arrays may be handled in XC; I don’t have to use any pointers in this case. (However, there is a problem, see USE_WRONG_RETURN_LIST explained below the code)

#include <stdio.h>
#include <iso646.h> // For "not" etc

#define USE_BOOL_TYPEDEF // No difference since no type conversions done
#ifdef USE_BOOL_TYPEDEF
    typedef enum {false,true} bool; // 0,1 
#else
    #define bool unsigned // long, int, unsigned, char, bool ok 
    #define false 0
    #define true 1 // Or "not false"
#endif

typedef unsigned index_t; // long, int, unsigned, char, bool compile ok..
// #define USE_WRONG_RETURN_LIST  // ..with this defined

{index_t, bool} isItemIn (
    const char            search_for, 
    const char            text[], 
    static const unsigned len) {

    index_t  index     = 0;
    bool     found     = false;
    bool     searching = true;

    while (searching) {
        if (text[index] == search_for) {
            found     = true;
            searching = false;
        } else {
            index++;
            if (index == len) {
                searching = false;
            } else {} // No code, maybe next round
        }
    }
    return {index, found}; // According to {index_t, bool} of return list
}

int main() {
    const char text[] = {'A','B','C','D','E','F','G','H','I','J'};
    index_t  index;
    bool     found;

    {index, found} = isItemIn ('C', text, sizeof(text));
    if (found) {
        printf("C found in %u\n", index);
    } else {
        printf("C not found\n");
    }
    #ifdef USE_WRONG_RETURN_LIST
        {found, index} = isItemIn ('X', text, sizeof(text));
    #else
        {index, found} =  isItemIn ('X', text, sizeof(text));
    #endif
    if (not found) {
        printf("X not found\n");
    } else {
        printf("X found in %u\n", index);
    }
    // LOG with 
    // USE_WRONG_RETURN_LIST not defined
    // USE_BOOL_TYPEDEF or not no difference
    //   C found in 2
    //   X not found
    return 0;
}

Defining USE_WRONG_RETURN_LIST shows that the compiler accepts any integer type for any definition I may have of bool. So, in this case I won’t get any help from the compiler, and I may compile wrong code. I guess that’s not much worse than having any list of (..,signed,signed,..) parameters in any function, where (..,degC,degF,..) or (..,degF,degC,..) may be passed in the wrong order. However, there is a small difference: with USE_WRONG_RETURN_LIST the compiler could have caught the error if bool were a type of its own to the compiler.

So, not even a typedef of bool makes it a unique type to the compiler. It’s still just an integer type.

I added this code and comment to the Wot no bool? thread. The answer from robertxmos was:

Correct, typedef should be viewed as a handy alias.
One could argue that instead of tightening a type system it does the opposite!
If you want unique types you need to reach for struct/union (class if using c++ & interface if using xC)
Even ‘enum’ is weak (unlike the ‘enum class’ in c++11)
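The struct advice can be sketched in plain C (not XC), reusing the degC/degF idea from above. All names here (degC_alias_t, degC_t, to_fahrenheit and so on) are hypothetical and only meant as an illustration: the typedef'ed integers are interchangeable, while the single-member structs are distinct types that the compiler will not mix up.

```c
/* Plain typedefs are aliases: both names are just "int" to the compiler */
typedef int degC_alias_t;
typedef int degF_alias_t;

/* Single-member structs are distinct types */
typedef struct { int value; } degC_t;
typedef struct { int value; } degF_t;

/* Accepts a degF_alias_t just as happily as a degC_alias_t */
int alias_to_fahrenheit(degC_alias_t c) {
    return (c * 9) / 5 + 32;
}

/* Passing a degF_t here is rejected at compile time */
degF_t to_fahrenheit(degC_t c) {
    degF_t f = { (c.value * 9) / 5 + 32 };
    return f;
}
```

Calling alias_to_fahrenheit with a value that already is in Fahrenheit compiles without any complaint, which is the same weakness as with the typedef'ed bool.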

And easier-to-get-right code to also solve it

This was triggered by the answer above.

Since interface is not for the below code, I collected the {index,found} into the struct isItemIn_return_t. I guess it’s less error prone since it’s easier to assign x.index=index than x.index=found. Even if I don’t do that, I do use it explicitly like isItemIn_return.index++. Doing it wrongly with isItemIn_return.found++ is easy to spot since I would think to increment a verb would ring a bell.

This uses the same amount of code and data as the above code. Maybe this indicates that the compiler also builds a struct for the parameter list? I kind of like it:

#include <stdio.h>
#include <iso646.h> // For "not" etc
typedef enum {false,true} bool; // 0,1

typedef unsigned index_t; // long, int, unsigned, char, bool compile ok..

typedef struct { // easier to get right than the anonymous return parameter list
    index_t index;
    bool    found;
} isItemIn_return_t;

isItemIn_return_t isItemIn (
    const char            search_for,
    const char            text[],
    static const unsigned len) {

    isItemIn_return_t isItemIn_return = {0, false}; 
    // Not type safe as {true, 0} will compile

    bool searching = true;

    while (searching) {
        if (text[isItemIn_return.index] == search_for) {
            isItemIn_return.found = true;
            searching = false;
        } else {
            isItemIn_return.index++;
            if (isItemIn_return.index == len) {
                searching = false;
            } else {} // No code, maybe next round
        }
    }
    return isItemIn_return;
}

int main() {
    const char text[] = {'A','B','C','D','E','F','G','H','I','J'};
    isItemIn_return_t isItemIn_retval;

    isItemIn_retval = isItemIn ('C', text, sizeof(text));
    if (isItemIn_retval.found) {
        printf("C found in %u\n", isItemIn_retval.index);
    } else {
        printf("C not found\n");
    }

    isItemIn_retval = isItemIn ('X', text, sizeof(text));

    if (not isItemIn_retval.found) {
        printf("X not found\n");
    } else {
        printf("X found in %u\n", isItemIn_retval.index);
    }
    // LOG:
    //   C found in 2
    //   X not found
    return 0;
}

Data types long long, float, double and long double

These data types are supported, even if the xTIMEcomposer user guide (document XM009801A, 2015/10/29 chapter XS1 Data Types) (see XC is C plus X [12]) says that they are not. Here is some code showing it:

#include <stdio.h>
void test (void) {

    long        mylong     = 0xfffffffe; // -2
    long long   mylonglong = 0xfffffffe; // Is 4294967294 since 64 bits

    for (unsigned i=0; i<5; i++) {
        printf ("LL(%u): mylong:%ld - mylonglong:%lld\n", i, mylong, mylonglong);

        mylong++;
        mylonglong++;
    }

    float       myfloat      = 16777214.0; // 2exp24 - 2 = 0xfffffe (needs the full 24-bit significand of IEEE 754 single-precision)
    int         myfloatint   = (int) myfloat;
    double      mydouble     = myfloat;
    long double mylongdouble = myfloat;

    for (unsigned i=0; i<5; i++) {
        printf ("F(%u) myfloat:%f - myfloatint:%d - mydouble:%lf - mylongdouble:%llf\n",
                i, myfloat, myfloatint, mydouble,  mylongdouble);

        myfloat      = myfloat + 1.0;
        myfloatint   = (int) myfloat;
        mydouble     = mydouble + 1.0;
        mylongdouble = mylongdouble + 1.0;
    }
}
int main() {
    par {
        test();
    }
    return 0;
}

Here is the printout:

LL(0): mylong:-2 - mylonglong:4294967294
LL(1): mylong:-1 - mylonglong:4294967295
LL(2): mylong:0 - mylonglong:4294967296
LL(3): mylong:1 - mylonglong:4294967297
LL(4): mylong:2 - mylonglong:4294967298
F(0) myfloat:16777214.000000 - myfloatint:16777214 - mydouble:16777214.000000 - mylongdouble:16777214.000000
F(1) myfloat:16777215.000000 - myfloatint:16777215 - mydouble:16777215.000000 - mylongdouble:16777215.000000
F(2) myfloat:16777216.000000 - myfloatint:16777216 - mydouble:16777216.000000 - mylongdouble:16777216.000000
F(3) myfloat:16777216.000000 - myfloatint:16777216 - mydouble:16777217.000000 - mylongdouble:16777217.000000
F(4) myfloat:16777216.000000 - myfloatint:16777216 - mydouble:16777218.000000 - mylongdouble:16777218.000000

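It shows that myfloat does not wrap around when its 24-bit significand (23 stored fraction bits plus the implicit leading bit) runs out of bits. It maxes out at 16777216.000000 (2exp24): from there on, adding 1.0 gives a result that is not representable in single precision, so it is rounded back to 16777216. The double and long double keep counting. See this discussed in the xCore Exchange post XC and long long, float, double and long double. Thanks and courtesy to “akp” there!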

See Wikipedia’s IEEE 754 single-precision binary floating-point format: binary32 and Double-precision floating-point format
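The float behaviour in the printout is not specific to XC. A minimal sketch in plain C, assuming IEEE 754 binary32 for float and binary64 for double (add_ones and add_ones_double are hypothetical helper names):

```c
/* Adds 1.0f to value 'steps' times, rounding to float after every step */
float add_ones(float value, unsigned steps) {
    volatile float v = value; /* volatile forces each sum back into a 32-bit float */
    for (unsigned i = 0; i < steps; i++) {
        v = v + 1.0f;
    }
    return v;
}

/* Same loop in double precision, which has 53 significand bits to spare */
double add_ones_double(double value, unsigned steps) {
    volatile double v = value;
    for (unsigned i = 0; i < steps; i++) {
        v = v + 1.0;
    }
    return v;
}
```

Starting at 16777214, the float version stops at 16777216 no matter how many more steps are taken, since 16777217 lies exactly between two representable values and is rounded to the even one.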

When to pass by reference, and when not

Observe that in order to modify a parameter inside a function or an interface function you have to use a reference parameter by adding the ‘&’ (like in C++). The below is standard procedure: you need a reference (much like a pointer in C) unless it’s a basic array (where the array name is an implicit pointer to element[0]). But one certainly needs to overlearn or overtest this to get it right:

#include <stdio.h>
typedef struct {unsigned arr[1];} my_array_in_struct_t;
typedef enum   {zero, one}        my_enum_e;

void my_func (my_enum_e &myval) { // NO WARNING IF '&' FORGOTTEN AND NEEDED..
    myval = one; // ..SINCE WE WANT TO MODIFY THE ORIGINAL
}
my_enum_e my_func_x (my_enum_e myval) { // NO REFERENCE NEEDED BY USAGE
    myval = one;  // ORIGINAL OBVIOUSLY NOT UPDATED
    return myval; // PROPER RETURN VALUE EXPLICITLY UPDATES THE ORIGINAL
}
void my_func_y (
        unsigned arr[1], // REFERENCE NOT NEEDED FOR ARRAY!
        my_array_in_struct_t &my_array_in_struct) // REFERENCE NEEDED FOR STRUCT!
{
    arr[0] = 1; // UPDATES ORIGINAL
    my_array_in_struct.arr[0] = 1; // UPDATES ORIGINAL
}

void test (void) {
    my_enum_e myval = zero;
    printf ("myval:%d, ", myval);
    my_func (myval);
    printf ("myval:%d\n", myval);    
    // Prints: "myval:0, myval:1"
    
    myval = zero;
    printf ("myval:%d, ", myval);
    myval = my_func_x (myval);
    printf ("myval:%d\n", myval);
    // Also prints "myval:0, myval:1"

    unsigned arr[1];
    my_array_in_struct_t my_array_in_struct;
    arr[0]=0;
    my_array_in_struct.arr[0] = 0;

    printf ("arr:%d, my_array_in_struct:%d - ", arr[0], my_array_in_struct.arr[0]);
    my_func_y (arr, my_array_in_struct);
    printf ("arr:%d, my_array_in_struct:%d\n", arr[0], my_array_in_struct.arr[0]);
    // Prints "arr:0, my_array_in_struct:0 - arr:1, my_array_in_struct:1"
}

int main() {
    par {
        test();
    }
    return 0;
}

This is of course well described in the note 141 ref [1] (XMOS Programming Guide, 5 Data handling and memory safety, 5.1 Extra data handling features, 5.1.1 References)
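For comparison, here is a sketch of how I would expect the same three cases to look in plain C, where the XC ‘&’ becomes an explicit pointer. This is an analogue under that assumption, not XC code, and the function names are hypothetical. It shows what silently happens when the reference is forgotten: only the copy is changed.

```c
typedef struct { unsigned arr[1]; } my_array_in_struct_t;

/* Struct passed by value: only the local copy is modified */
void set_by_value(my_array_in_struct_t s) {
    s.arr[0] = 1;
}

/* Struct passed by pointer: the caller's struct is modified */
void set_by_pointer(my_array_in_struct_t *s) {
    s->arr[0] = 1;
}

/* Array parameter: decays to a pointer, so the caller's array is modified */
void set_array(unsigned arr[1]) {
    arr[0] = 1;
}

/* Small drivers that return what the caller sees afterwards */
unsigned after_by_value(void) {
    my_array_in_struct_t s = {{0}};
    set_by_value(s);
    return s.arr[0]; /* still 0 */
}

unsigned after_by_pointer(void) {
    my_array_in_struct_t s = {{0}};
    set_by_pointer(&s);
    return s.arr[0]; /* now 1 */
}

unsigned after_array(void) {
    unsigned arr[1] = {0};
    set_array(arr);
    return arr[0]; /* now 1 */
}
```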

Timer handling

Repeated busy-poll with timeout

Timeout intro

Time/timer handling has been extensively discussed in Know your timer’s type, also for XC. The background material is there.

Sometimes we want to read from, for example, a register on an I/O chip until some value is seen. However, we can’t read forever, so we need to wrap the “busy-poll” in some timeout mechanism to avoid it repeating forever.

Below there are five chapters, with full code (for xTIMEcomposer) and logs:

  1. C that works with and XC that fails
    Cannot be used!
    The C-code for the Arduino is “blocking” and there is no support for multi-threading. However, my first attempt to port this to XC failed
  2. XC that works with 0.65536 ms resolution
    Use this “MILLIS” pattern if you can accept about 152 ticks per 100 ms.
    But a version where the 1 ms resolution of the Arduino millis() had to be ported as 0.65536 ms ticks in XC works. The problem is how “modulo 1 ms” arithmetic is done when the timer word width is 32 bits and the XMOS processor increments it every 10 ns. So 10 ns * 2exp16 -> 0.65536 ms. This timing is done “inline”, but since anything XC is multi-threading (multi-task) that doesn’t really harm it. The problem is I can’t order a 100 ms timeout (but I could do 152 ticks = 99.61472 ms)
  3. XC that works with 1 ms resolution
    Use this standard XC pattern if you want to use XC’s construct as it is.
    Here we have proper 1 ms resolution, done by “proper” handling by an XC timer, by storing a future timeout-value in proper 32 bits width. The code is “inline” as above
  4. XC server with state-based timing serving two clients
    Use this standard XC pattern if you only have a timeout per task.
    Here the timing is handled in a proper select case. I guess this is closest to idiomatic XC. The server serves two clients, one that fails and one that succeeds in reading a register. Impressively, the handling of the two clients is “fair”, i.e. neither of them jams the other
  5. XC true MILLIS function (courtesy akp)
    Use this real MILLIS pattern if you can accept a local state per task. Uses intermediate 64 bit values.
    Having done all the four code examples above I started a thread about this at the XCore Exchange. “akp” had a very smart solution that does a true MILLIS and works and looks like the Arduino code. I copied it back here, see  XC true MILLIS function (courtesy akp)

For me this chapter started with use of the Arduino millis(); timeout code in a library. Not that I  haven’t made timeouts before, in several languages and using several mechanisms (like a built-in timer/counter HW unit and its interrupt drivers or (the opposite) using CSP-like channel timeouts (like the idiomatic XC select case timeouts)) – and the XC language even has language support for this through the timer primitive. But there was still something new about the mechanism built with the Arduino millis() function:

C that works with and XC that fails

Cannot be used!

Here is an example from LowPowerLab’s RFM69 library (here, file RFM69.cpp). I have added some parentheses to free my thinking from operator precedence, which I am utterly bad at (partly because I was “raised” on occam that doesn’t have it! (to the complaints of those who hate (too many!?) parenthesis levels)):

uint32_t txStart = millis();
while ((digitalRead(_interruptPin) == 0) && (millis() - txStart < RF69_TX_LIMIT_MS)); 
// wait for DIO0 to turn HIGH signalling transmission finish

The millis() value “Returns the number of milliseconds since the Arduino board began running the current program. This number will overflow (go back to zero), after approximately 50 days.” (2exp32 ms = 4294967296 ms = 4294967.296 seconds = 1193.046471111111111 hours = 49.71026962962963 days).

I made a test of this code (there) at the EDA playground. It doesn’t run the code above, it just tests the wrap-around properties. It works perfectly fine also when the value wraps around, whether I use (A) int32_t (wraps around after MAXPOS(int32_t) 0x7FFFFFFF to MAXNEG(int32_t) 0x80000000) or (B) uint32_t (wraps around after MAXPOS(uint32_t) 0xFFFFFFFF to zero). As seen in the Know your timer’s type blog note, as long as the value is monotonically increasing (apart from the wrap-around) it doesn’t matter. How the most significant bit is interpreted is irrelevant. (I have not inserted the code here.)
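The wrap-around property can be sketched in plain C for case (B). This is not the EDA playground test, just a minimal illustration with hypothetical function names; limit_ms plays the role of RF69_TX_LIMIT_MS. Unsigned subtraction in C is defined modulo 2exp32, so the elapsed time comes out right even when the counter has wrapped once between the two readings.

```c
#include <stdint.h>

/* Elapsed time between two readings of a free-running 32-bit counter.
   Unsigned subtraction wraps modulo 2exp32, so one wrap-around is harmless */
uint32_t elapsed_u32(uint32_t now, uint32_t start) {
    return now - start;
}

/* The same kind of test as in the RFM69 loop */
int within_limit(uint32_t now, uint32_t start, uint32_t limit_ms) {
    return elapsed_u32(now, start) < limit_ms;
}
```

For example, a start value 5 ms before the wrap (0xFFFFFFFB) and a now value of 5 gives an elapsed time of 10 ms.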

By the way, I have started to discuss the library porting to XC at My aquarium’s data radioed through the shelf, chapter “Porting to XMOS XC”.

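When I made a millis() function in XC it did not work! The problem with the below code is that it does not do correct modulo-word-width arithmetic, since the timer tick is 10 ns and not 1 ms. I can’t just divide: the divided value then wraps around at a value that is not a power of two (around 42949 in the log below, which is 2exp32 / 100000), so a difference taken across the wrap is wrong. So this does not handle wrap-around correctly: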

// http://www.xcore.com/viewtopic.php?f=26&t=6470
#include <stdio.h>
#include <xs1.h>
#include <iso646.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

typedef enum {false,true} bool;

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

// https://www.arduino.cc/reference/en/language/functions/time/millis/
signed millis () {
    timer tmr;
    signed current_time;
    tmr :> current_time;
    return (current_time / XS1_TIMER_KHZ); // Never works! XS1_TIMER_KHZ is 100*1000
}

unsigned digitalRead (void) {
    return (0);
}

#define RF69_TX_LIMIT_MS 100
#define TEST_TYPE signed // Too wide and does not match millis()
#define MILLIS() millis()

void test_task (void) {

    TEST_TYPE millis_;
    TEST_TYPE txStart;
    unsigned  testCnt = 0;
    bool      not_timed_out;

    while (testCnt < 500) {
        txStart = MILLIS();
        do {
            millis_ = MILLIS();
            not_timed_out = (millis_ - txStart) < RF69_TX_LIMIT_MS;
            debug_print    ("testCnt(%d), millis(%d), txStart(%d), millis_-txStart(%d), timedOut(%d)\n",
                    testCnt, millis_, txStart, millis_ - txStart, !not_timed_out);
            delay_milliseconds(20);
        } while ((digitalRead() == 0) && not_timed_out);
        testCnt++;
    }
}

int main() {
    par {
        test_task();
    }
    return 0;
}

/* DOES NOT WORK!
testCnt(351), millis(42729), txStart(42629), millis_-txStart(100), timedOut(1)
testCnt(352), millis(42749), txStart(42749), millis_-txStart(0), timedOut(0)
testCnt(352), millis(42769), txStart(42749), millis_-txStart(20), timedOut(0)
testCnt(352), millis(42789), txStart(42749), millis_-txStart(40), timedOut(0)
testCnt(352), millis(42809), txStart(42749), millis_-txStart(60), timedOut(0)
testCnt(352), millis(42829), txStart(42749), millis_-txStart(80), timedOut(0)
testCnt(352), millis(42849), txStart(42749), millis_-txStart(100), timedOut(1) LAST TIMEOUT EVER!
testCnt(353), millis(42869), txStart(42869), millis_-txStart(0), timedOut(0)
testCnt(353), millis(42890), txStart(42869), millis_-txStart(21), timedOut(0)
testCnt(353), millis(42910), txStart(42869), millis_-txStart(41), timedOut(0)
testCnt(353), millis(42930), txStart(42869), millis_-txStart(61), timedOut(0)
testCnt(353), millis(0), txStart(42869), millis_-txStart(-42869), timedOut(0)
testCnt(353), millis(20), txStart(42869), millis_-txStart(-42849), timedOut(0)
testCnt(353), millis(40), txStart(42869), millis_-txStart(-42829), timedOut(0)
testCnt(353), millis(60), txStart(42869), millis_-txStart(-42809), timedOut(0)
testCnt(353), millis(80), txStart(42869), millis_-txStart(-42789), timedOut(0)
testCnt(353), millis(100), txStart(42869), millis_-txStart(-42769), timedOut(0)
testCnt(353), millis(120), txStart(42869), millis_-txStart(-42749), timedOut(0)
testCnt(353), millis(140), txStart(42869), millis_-txStart(-42729), timedOut(0)
*/
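Why the division breaks the wrap-around can be sketched in plain C. I assume here that the tick count behaves as an unsigned 32-bit value, which is what the log above suggests; the function names are hypothetical. Dividing first and subtracting afterwards fails across the wrap, while subtracting the raw ticks first and dividing afterwards does not.

```c
#include <stdint.h>

#define TICKS_PER_MS 100000u /* 100 MHz timer: 100*1000 ticks of 10 ns per ms */

/* Same idea as the failing millis(): scale the 32-bit tick count down to ms.
   The result only spans 0..42949 and then jumps back to 0 */
uint32_t millis_from_ticks(uint32_t ticks) {
    return ticks / TICKS_PER_MS;
}

/* Divide first, subtract afterwards: wrong across the wrap */
int32_t elapsed_ms_by_division(uint32_t start_ticks, uint32_t now_ticks) {
    return (int32_t)millis_from_ticks(now_ticks)
         - (int32_t)millis_from_ticks(start_ticks);
}

/* Subtract modulo 2exp32 first, divide afterwards: correct across the wrap */
int32_t elapsed_ms_from_tick_diff(uint32_t start_ticks, uint32_t now_ticks) {
    return (int32_t)((now_ticks - start_ticks) / TICKS_PER_MS);
}
```

With a start at 42869 ms and a reading 81 ms later (just after the 32-bit wrap), the first function gives -42869, like the millis(0) line in the log, and the second gives 81.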

XC that works with 0.65536 ms resolution

Use this “MILLIS” pattern if you can accept about 152 ticks per 100 ms.

However, the next code works. But it has to be not millis(), but 65536-micros(), which I have called millis_0_65536 (fms() in the code below). This compares to XC/XMOS 100 MHz system ticks, i.e. 10 ns ticks shifted down 16 bits (or divided by 65536). This takes a full 32 bits XC timer value down to a 16 bits value. The arithmetic is then modulo short (16 bits), done so by the compiler. In addition I had to define a diff variable that was also of type short (16 bits); if not, it did not work. I guess this is because the subtraction now_fms - txStart_fms is evaluated as a primitive type (32 bits) difference, not as a derived type (16 bits) difference (in C the short operands are promoted to int before the subtraction). Anyhow, this works, not with 1 ms resolution but with 0.65536 ms resolution, so I have defined a “fast millisecond” here:

// http://www.teigfam.net/oyvind/home/technology/165-xc-code-examples/#xc_that_works_with_065536_ms_resolution
// http://www.xcore.com/viewtopic.php?f=26&t=6470
#include <stdio.h>
#include <xs1.h>
#include <iso646.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

typedef enum {false,true} bool;

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

// DEFINING A FMS-TICK AS A "fast ms" (fms, FMS) - AND IT IS 0.65536 ms (2exp16=65536)
// One fms-tick is 100 MHz XMOS system-tick (10 ns) into a 16 bits word every 65536 system-tick
//
typedef signed short time16_fms_t; // fms=fast ms
//
#define FAST_MILLIS_PER_10MS    15 //   10 / .65536
#define FAST_MILLIS_PER_100MS  153 //  100 / .65536
#define FAST_MILLIS_PER_1S    1526 // 1000 / .65536
//
#define MS_TO_FMS(ms) (((ms)*FAST_MILLIS_PER_1S)/1000) // Argument parenthesized so that expressions like (a+b) scale correctly

time16_fms_t fms() { // "MILLIS": fms=fast ms. Returns one tick as 0.65536 ms (10ns * 65536)
    timer tmr;           // 32 bits
    signed current_time; // 32 bits
    tmr :> current_time; // 32 bits
    return (time16_fms_t) (current_time >> 16); // 16 bits. Keep sign bit (or use / 65536)
}

unsigned digitalRead (void) {
    return (0);
}

#define RF69_TX_LIMIT_MS    100
#define RF69_TX_LIMIT_FMS   MS_TO_FMS (RF69_TX_LIMIT_MS)

void test_task (void) {

    time16_fms_t now_fms;
    time16_fms_t txStart_fms;
    time16_fms_t diff_fms;

    unsigned  testCnt = 0;
    bool      not_timed_out;
    debug_print ("100 ms is %d ticks\n",RF69_TX_LIMIT_FMS);

    while (testCnt < 500) {
        txStart_fms = fms();
        do {
            now_fms = fms();
            diff_fms = now_fms - txStart_fms;
            not_timed_out = diff_fms < RF69_TX_LIMIT_FMS;
            debug_print    ("testCnt(%d), now_fms(%d), txStart_fms(%d), diff_fms(%d), timedOut(%d)\n",
                    testCnt, now_fms, txStart_fms, diff_fms, !not_timed_out);
            delay_milliseconds (RF69_TX_LIMIT_MS/2); // TRUE 50 ms!
        } while ((digitalRead() == 0) && not_timed_out);
        testCnt++;
    }
}

int main() {
    par {
        test_task();
    }
    return 0;
}
/* WORKS:
100 ms is 152 ticks
testCnt(0), now_fms(581), txStart_fms(581), diff_fms(0), timedOut(0)
testCnt(0), now_fms(657), txStart_fms(581), diff_fms(76), timedOut(0)
testCnt(0), now_fms(734), txStart_fms(581), diff_fms(153), timedOut(1)
testCnt(1), now_fms(810), txStart_fms(810), diff_fms(0), timedOut(0)
...
testCnt(139), now_fms(32586), txStart_fms(32434), diff_fms(152), timedOut(1)
testCnt(140), now_fms(32663), txStart_fms(32663), diff_fms(0), timedOut(0)
testCnt(140), now_fms(32739), txStart_fms(32663), diff_fms(76), timedOut(0)
testCnt(140), now_fms(-32719), txStart_fms(32663), diff_fms(154), timedOut(1)
testCnt(141), now_fms(-32643), txStart_fms(-32643), diff_fms(0), timedOut(0)
testCnt(141), now_fms(-32567), txStart_fms(-32643), diff_fms(76), timedOut(0)
testCnt(141), now_fms(-32490), txStart_fms(-32643), diff_fms(153), timedOut(1)
*/
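The role of the 16-bit diff variable can be sketched in plain C with the values from the log line where now_fms has wrapped to negative. The function names are hypothetical, and the narrowing conversion to int16_t is implementation-defined in C (it wraps on the mainstream compilers I know of), so take this as an illustration only.

```c
#include <stdint.h>

/* One fms tick is 65536 system ticks of 10 ns, i.e. 0.65536 ms */
int16_t fms_from_ticks(uint32_t ticks) {
    return (int16_t)(uint16_t)(ticks >> 16);
}

/* Difference narrowed to 16 bits, so that it wraps modulo 2exp16 */
int16_t fms_diff(int16_t now, int16_t start) {
    return (int16_t)(uint16_t)((uint16_t)now - (uint16_t)start);
}

/* Without narrowing: both operands are promoted to int before the subtraction */
int fms_diff_promoted(int16_t now, int16_t start) {
    return now - start;
}
```

For now_fms(-32719) and txStart_fms(32663) the narrowed difference is 154, as in the log, while the promoted difference is -65382 and would never reach the timeout limit.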

XC that works with 1 ms resolution

Use this standard XC pattern if you want to use XC’s construct as it is.

The next code works, and with the expected 1 ms resolution. I am more used to comparing against the future timeout time than comparing against used time being below some max allowed. So, here is some test code in XC showing this, and it works! It looks more verbose, but it basically isn’t. Also, there is no milliseconds timer here; all values are 32 bit 100 MHz (10 ns ticks) values, i.e. the basic timer of XC/XMOS. A ms is exactly 100*1000 ticks of 10 ns. The first part is test_task and it has a lot of prints. The other is the same functionality in real_task with little printing:

#include <stdio.h>
#include <xs1.h>
#include <iso646.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

typedef enum {false,true} bool;

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

#define AFTER_32(a,b) (((a)-(b))>0)
#define LEFT_32(a,b)  ((b)-(a))
#define NOW_32(t,n) do {t :> n;} while(0) // A tick is 10ns (100 MHz timer in XC). 32 bits wide

#define WAIT_FOR_REGISTER_MS 1000
#define NO_REGISTER_VALUE    0
#define NO_REGISTER          NO_REGISTER_VALUE
#define SOME_REGISTER       (NO_REGISTER_VALUE+1)

unsigned readRegister (const unsigned reg) {
    delay_milliseconds(200);
    return (reg);
}

{bool, signed} test_task (void) { // return_read_ok, return_timed_out_cnt
    unsigned read_value;

    timer tmr; // A tick is 10ns (100 MHz timer in XC). 32 bits wide
    // Rotates from 0 to MOSTPOS then one ++ to MOSTNEG then up to zero again
    // MOSTPOS =  2exp31-1 = 2147483647 = 0x7FFFFFFF
    // MOSTNEG = -2exp31   =-2147483648 = 0x80000000, then finally to 0xFFFFFFFF(-1)

    signed   now_time_ticks;
    signed   start_time_ticks;
    signed   timeout_at_ticks;
    bool     timed_out      = false;
    bool     return_read_ok = false;
    unsigned return_timed_out_cnt = 0;

    debug_print ("%s\n", "test_task started");

    while ((return_timed_out_cnt < 30) and (not return_read_ok)) {

        NOW_32 (tmr, start_time_ticks);
        timeout_at_ticks = start_time_ticks + (WAIT_FOR_REGISTER_MS * XS1_TIMER_KHZ);
        debug_print ("RESTART START(%d) and TIMEOUTAT(%d)\n",
                start_time_ticks, timeout_at_ticks);

        do {
            read_value = readRegister (NO_REGISTER); // Try SOME_REGISTER
            return_read_ok = (read_value != NO_REGISTER_VALUE);

            if (return_read_ok) {
                // Finished
            } else {
                NOW_32 (tmr, now_time_ticks);
                timed_out = AFTER_32 (now_time_ticks, timeout_at_ticks);

                if (timed_out) {
                    return_timed_out_cnt++;
                } else {} // Continue
            }

            debug_print ("  START(%d), NOW(%d), LEFT(%d), TIMEDOUT(%d) - TIMEOUTS(%d)\n",
                    start_time_ticks,
                    now_time_ticks,
                    LEFT_32(now_time_ticks, timeout_at_ticks),
                    timed_out,
                    return_timed_out_cnt);
        } while ((not return_read_ok) and (not timed_out));
    }
    debug_print ("test_task finished with READOK(%d) and TIMEOUTS(%d)\n",
            return_read_ok, return_timed_out_cnt);
    return  {return_read_ok, return_timed_out_cnt};
}

/*
RESTART START(2039473606) and TIMEOUTAT(2139473606)
  START(2039473606), NOW(2059478404), LEFT(79995202), TIMEDOUT(0) - TIMEOUTS(20)
  START(2039473606), NOW(2079486402), LEFT(59987204), TIMEDOUT(0) - TIMEOUTS(20)
  START(2039473606), NOW(2099494400), LEFT(39979206), TIMEDOUT(0) - TIMEOUTS(20)
  START(2039473606), NOW(2119502398), LEFT(19971208), TIMEDOUT(0) - TIMEOUTS(20)
  START(2039473606), NOW(2139510396), LEFT(-36790), TIMEDOUT(1) - TIMEOUTS(21)
RESTART START(2139518043) and TIMEOUTAT(-2055449253)
  START(2139518043), NOW(-2135444355), LEFT(79995102), TIMEDOUT(0) - TIMEOUTS(21)
  START(2139518043), NOW(-2115436255), LEFT(59987002), TIMEDOUT(0) - TIMEOUTS(21)
  START(2139518043), NOW(-2095428155), LEFT(39978902), TIMEDOUT(0) - TIMEOUTS(21)
  START(2139518043), NOW(-2075420055), LEFT(19970802), TIMEDOUT(0) - TIMEOUTS(21)
  START(2139518043), NOW(-2055411955), LEFT(-37298), TIMEDOUT(1) - TIMEOUTS(22)
RESTART START(-2055404238) and TIMEOUTAT(-1955404238)
  START(-2055404238), NOW(-2035399198), LEFT(79994960), TIMEDOUT(0) - TIMEOUTS(22)
  START(-2055404238), NOW(-2015390996), LEFT(59986758), TIMEDOUT(0) - TIMEOUTS(22)
  START(-2055404238), NOW(-1995382794), LEFT(39978556), TIMEDOUT(0) - TIMEOUTS(22)
  START(-2055404238), NOW(-1975374592), LEFT(19970354), TIMEDOUT(0) - TIMEOUTS(22)
  START(-2055404238), NOW(-1955366390), LEFT(-37848), TIMEDOUT(1) - TIMEOUTS(23)
*/

{bool, signed} real_task (void) { // return_read_ok, return_timed_out_cnt
    unsigned read_value;

    timer tmr; // A tick is 10ns (100 MHz timer in XC). 32 bits wide
    // Rotates from 0 to MOSTPOS then one ++ to MOSTNEG then up to zero again
    // MOSTPOS =  2exp31-1 = 2147483647 = 0x7FFFFFFF
    // MOSTNEG = -2exp31   =-2147483648 = 0x80000000, then finally to 0xFFFFFFFF(-1)

    signed   start_time_ticks;
    signed   timeout_at_ticks;
    bool     timed_out      = false;
    bool     return_read_ok = false;
    unsigned return_timed_out_cnt = 0;

    debug_print ("%s\n", "real_task started");

    tmr :> start_time_ticks;
    timeout_at_ticks = start_time_ticks + (WAIT_FOR_REGISTER_MS * XS1_TIMER_KHZ);

    do {
        read_value     = readRegister (NO_REGISTER); // Try SOME_REGISTER
        return_read_ok = (read_value != NO_REGISTER_VALUE);

        if (return_read_ok) {
            // Finished
        } else {
            signed now_time_ticks;
            tmr :> now_time_ticks;
            timed_out = AFTER_32 (now_time_ticks, timeout_at_ticks);

            if (timed_out) {
                return_timed_out_cnt++;
            } else {} // Continue
        }
    } while ((not return_read_ok) and (not timed_out));

    debug_print ("real_task finished with READOK(%d) and TIMEOUTS(%d)\n",
            return_read_ok, return_timed_out_cnt);
    return  {return_read_ok, return_timed_out_cnt};
}

int main() {
    par {
        test_task();
        real_task();
    }
    return 0;
}

/*
real_task started
test_task started
RESTART START(38716970) and TIMEOUTAT(138716970)
  START(38716970), NOW(58721387), LEFT(79995583), TIMEDOUT(0) - TIMEOUTS(0)
  START(38716970), NOW(78728680), LEFT(59988290), TIMEDOUT(0) - TIMEOUTS(0)
  START(38716970), NOW(98735973), LEFT(39980997), TIMEDOUT(0) - TIMEOUTS(0)
  START(38716970), NOW(118743266), LEFT(19973704), TIMEDOUT(0) - TIMEOUTS(0)
real_task finished with READOK(0) and TIMEOUTS(1)
  START(38716970), NOW(138750700), LEFT(-33730), TIMEDOUT(1) - TIMEOUTS(1)
RESTART START(138757765) and TIMEOUTAT(238757765)
  START(138757765), NOW(158762323), LEFT(79995442), TIMEDOUT(0) - TIMEOUTS(1)
  START(138757765), NOW(178769897), LEFT(59987868), TIMEDOUT(0) - TIMEOUTS(1)
  START(138757765), NOW(198777471), LEFT(39980294), TIMEDOUT(0) - TIMEOUTS(1)
  START(138757765), NOW(218785045), LEFT(19972720), TIMEDOUT(0) - TIMEOUTS(1)
  START(138757765), NOW(238792619), LEFT(-34854), TIMEDOUT(1) - TIMEOUTS(2)
*/

XC server with state-based timing serving two clients

Use this standard XC pattern if you only have a timeout per task.

This code is the most complete, and of course it works. There are two clients asking for registers from a server. Client[0] asks for “register 0” that always fails (with “ok(0)” in the log) – and client[1] asks for “register 1” that always succeeds (with “ok(1)” in the log).

The timing section of interest here is this select case in Deliver_register_server_task:

case (state==s_read) => tmr when timerafter(timeout_at_ticks) :> void:

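That case reads the register while (state==s_read). It finishes if the read succeeds; otherwise it retries at fixed intervals until retrial_cnt reaches 5 (SERVER_TIMEOUT_MS/SERVER_RETRY_MS). Note that the code as listed adds SERVER_TIMEOUT_MS * XS1_TIMER_KHZ ticks per retry, so the retries are 1000 ms apart. This way of doing a timeout is, I guess, the one I have used the most.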

#include <stdio.h>
#include <xs1.h>
#include <iso646.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc
#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

typedef enum {false,true} bool;

typedef interface read_reg_if {
    [[guarded]]              void            read_register (const unsigned reg_num);
    // Guard needed so that clients shall not overlap

    [[notification]] slave   void            read_register_ready (void);
    [[clears_notification]] {unsigned, bool} get_register (void); // Returns value, ok
} read_reg_if;

unsigned digitalRead (const unsigned reg_num) {
    unsigned reg_value = reg_num; // Easy rule to code by. 0 causes "ok(0)"/ERR and 1 causes "ok(1)"/OK
    return (reg_value);
}

#define NUM_CLIENTS 2

typedef enum {c_next, c_wait} client_state_e;
#define CLIENT_POLL_EVERY_SECS 4

[[combinable]] void Read_register_client_task (
        client read_reg_if i_read_reg, const unsigned iof_client) {

    timer          tmr;
    signed         timeout_at_ticks;
    client_state_e state = c_next;
    signed         num_services = 0; // iof_client[0,1] are actually served FAIR!
                                     // And I have no private code to ensure it!

    tmr:> timeout_at_ticks;
    timeout_at_ticks += CLIENT_POLL_EVERY_SECS * XS1_TIMER_HZ;

    while(1) {
        select {
            case (state==c_next) => tmr when timerafter(timeout_at_ticks) :> void: {
                timeout_at_ticks += CLIENT_POLL_EVERY_SECS * XS1_TIMER_HZ; // No time skew
                num_services++;
                debug_print ("Client    [%d] read_register pending with num_services(%d)\n",
                        iof_client, num_services);

                i_read_reg.read_register(iof_client); // Any number that's unique

                unsigned now_log_timer_ticks; // Using unsigned to avoid minus sign:
                tmr :> now_log_timer_ticks;
                debug_print ("[%5u ms]\nClient    [%d] read_register sent\n",
                        now_log_timer_ticks/XS1_TIMER_KHZ, // Max 42949
                        iof_client);
                state = c_wait;
            } break;

            case (state==c_wait) => i_read_reg.read_register_ready() : {
               unsigned register_value;
               bool     register_ok;

               {register_value, register_ok} = i_read_reg.get_register();
               debug_print ("Client    [%d] got register(%d) ok(%d)\n",
                       iof_client, register_value, register_ok);
               state = c_next;
            } break;
        }
    }
}

typedef enum {s_next, s_read, s_deliver} server_state_e; // s_deliver not used explicitly
#define SERVER_TIMEOUT_MS 1000
#define SERVER_RETRY_MS    200

[[combinable]] void Deliver_register_server_task (
        server read_reg_if i_read_reg[NUM_CLIENTS]) {

    timer          tmr;
    signed         timeout_at_ticks;
    server_state_e state = s_next;
    unsigned       register_value;
    int            index_of_client;
    bool           register_ok;
    unsigned       reg_num;
    unsigned       retrial_cnt = 0;

    while(1) {
        select {
            case (state==s_next) => i_read_reg[int index_of_client_].read_register
                    (const unsigned reg_num_) : {

                index_of_client = index_of_client_;
                reg_num = reg_num_;
                tmr :> timeout_at_ticks; // Immediately digitalRead
                retrial_cnt = 0;
                state = s_read;
                debug_print ("Server for[%d] read_register\n", index_of_client);
            } break;

            case (state==s_read) => tmr when timerafter(timeout_at_ticks) :> void: {
                register_value = digitalRead(reg_num);
                if (register_value == 0) {
                    retrial_cnt++;
                    if (retrial_cnt == (SERVER_TIMEOUT_MS/SERVER_RETRY_MS)) { // ==5
                        debug_print ("Server for[%d] retry(%d) max\n", index_of_client, retrial_cnt);
                        register_ok = false;
                        i_read_reg[index_of_client].read_register_ready();
                        state = s_deliver;
                    } else {
                        debug_print ("Server for[%d] retry(%d)\n", index_of_client, retrial_cnt);
                        timeout_at_ticks += SERVER_TIMEOUT_MS * XS1_TIMER_KHZ; // No time skew
                    }
                } else {
                    debug_print ("Server for[%d] register == OK \n", index_of_client);
                    register_ok = true;
                    i_read_reg[index_of_client].read_register_ready();
                    state = s_deliver; // Not tested, but leads to get_register
                }
            } break;

            case i_read_reg[int index_of_client_].get_register (void) ->
                    {unsigned return_register_value, bool return_ok}: {

                // No need to guard with (state==s_deliver) since that protocol is ensured by client
                // usage and the client/server rule with [[notification]] and [[clears_notification]]
                // would cause deadlock if not followed

                debug_print ("Server for[%d] register value(%d) ok(%d) sent\n",
                        index_of_client, register_value, register_ok);
                return_register_value = register_value;
                return_ok = register_ok;
                state = s_next;
            } break;
        }
    }
}

int main() {
    read_reg_if i_read_reg[NUM_CLIENTS];
    par {
        Read_register_client_task    (i_read_reg[0], 0); // Will fail    with "ok(0)"/ERR
        Read_register_client_task    (i_read_reg[1], 1); // Will succeed with "ok(1)"/OK
        Deliver_register_server_task (i_read_reg);
    }
    return 0;
}

And here is some of the log. You can see that the select is “fair” since the num_services values actually follow each other for client[0] and client[1]. I guess this is done automatically by how the compiler builds the code, since I haven’t done anything to ensure it. This is rather impressive, especially since the server does different jobs for each of them! But I assume it has to do with some queuing, see “pending”. I have discussed fair or fairness a lot in Nondeterminism.

Server for[0] read_register
[14918 ms]
Client    [0] read_register sent
Server for[0] retry(1)
Server for[0] retry(2)
Server for[0] retry(3)
Server for[0] retry(4)
Client    [1] read_register pending with num_services(112)
Server for[0] retry(5) max
Server for[0] register value(0) ok(0) sent
Client    [0] got register(0) ok(0)
Server for[1] read_register
Client    [0] read_register pending with num_services(112)
Server for[1] register == OK 
[18918 ms]
Client    [1] read_register sent
Server for[1] register value(1) ok(1) sent
Client    [1] got register(1) ok(1)
Server for[0] read_register
[18918 ms]
Client    [0] read_register sent
Server for[0] retry(1)
Server for[0] retry(2)
Server for[0] retry(3)
Server for[0] retry(4)
Client    [1] read_register pending with num_services(113)
Server for[0] retry(5) max
Server for[0] register value(0) ok(0) sent
Client    [0] got register(0) ok(0)
Server for[1] read_register
Client    [0] read_register pending with num_services(113)
Server for[1] register == OK

XC true MILLIS function (courtesy akp)

Use this real MILLIS pattern if you can accept a local state per task. Uses intermediate 64 bit values.

The code below was suggested by “akp” at the XCore Exchange, in the already mentioned thread that I started: Porting the Arduino millis(), possible?. It uses one local state per task, and that state stores the values that need to survive between calls. It also uses a 64 bit intermediate value. The code in a comment (by me) on page 3 (dated Thu Mar 22, 2018 9:42 pm) probably is the final code (still courtesy of and thanks to akp!). However, I must admit that I have made some new typedefs and defines here because I have learned to like this style:

#include <xs1.h>
#include <stdio.h>
#include <iso646.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

typedef enum {false,true} bool;

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

typedef unsigned long      systick32_t; 
typedef unsigned long long tmptick64_t; 
typedef unsigned long      millis32_t;

#define extend_32to64(tohigh32,tolow32) ((((tmptick64_t)tohigh32)<<32) bitor (tmptick64_t)tolow32)
#define XS1_TIMER_KHZ_LL ((tmptick64_t)XS1_TIMER_KHZ)

typedef struct millis_state_t {
    systick32_t tick_hi;
    systick32_t last_tick;
} millis_state_t;

#define MILLIS(s) millis_p(s)

millis32_t millis_p (millis_state_t &millis_state) {
    timer tmr;
    tmptick64_t total_ticks;
    systick32_t current_tick;
    tmr :> current_tick;
    if (current_tick < millis_state.last_tick) {
        ++millis_state.tick_hi;
    } else {}
    millis_state.last_tick = current_tick;
    total_ticks = extend_32to64 (millis_state.tick_hi,current_tick);
    return (millis32_t)(total_ticks / XS1_TIMER_KHZ_LL);
}

unsigned digitalRead (void) {
    delay_milliseconds (200);
    return (0);
}

#define RF69_TX_LIMIT_MS 1000

void test_task (int task_num) {

    millis32_t     millis_;
    millis32_t     txStart;
    unsigned       testCnt = 0;
    bool           not_timed_out;
    millis_state_t millis_state = {0,0};

    while (testCnt < 2000) {
        txStart = MILLIS(millis_state);
        do {
            millis_ = MILLIS(millis_state);
            not_timed_out = (millis_ - txStart) < RF69_TX_LIMIT_MS;
            debug_print ("task %d: testCnt(%d), millis(%d), txStart(%d), millis_-txStart(%d), timedOut(%d)\n",
                    task_num, testCnt, millis_, txStart, millis_ - txStart, !not_timed_out);
            delay_milliseconds(20);
        } while ((digitalRead() == 0) and not_timed_out);
        testCnt++;
    }
}

int main() {
    par {
        test_task(0);
        test_task(1);
    }
    return 0;
}

timerafter to now and skew

This is described in “Programming XC on XMOS Devices” by Douglas Watt (XC is C plus X, ref [14], page 18) like this:

A programming error may be introduced by inputting the new time instead of ignoring it with a cast to void, as in t when timerafter(event_time) :> time; Because the processor completes the input shortly after the time specified is reached, this operation actually increments the value of time by a small additional amount. This amount may be compounded over multiple loop iterations, leading to signal drift and ultimately a loss of synchronisation with a receiver. (Added by me: see next chapter: this is not a fault in the XC language!)

timerafter to now and skew code

Don’t use the now output for anything else than it being now!

Should you in any way use it to calculate the next timeout, you will get time skew. The skew is the sum of all lost ticks. Here is the code to show the skew:

// Almost as XMOS Programming Guide XM004440A 2015/9/18 page 33

#include <xs1.h>
#include <stdio.h>
#include <iso646.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

// See Know your timer's type
// ABOUT SIGNED OR UNSIGNED:
// signed int (=signed) or unsigned int (=unsigned) both ok, as long as they are monotonically increasing. XC/XMOS 100 MHz increment every 10 ns for max 2exp32 = 4294967296, ie. 42.9.. seconds
typedef signed int time32_t;

#define ONE_SECOND_TICKS (1000 * XS1_TIMER_KHZ) // MAX INC IS 2exp31/(100 mill) = 21.47 seconds

void test_task (void) {

    timer     tmr;
    time32_t  event_time; // Name from XMOS lib_i2c, both time stamps are used

    tmr :> event_time;
    debug_print ("\ntimerafter event_time %08X and ONE_SECOND_TICKS is %08X (%d)\n", event_time, ONE_SECOND_TICKS, ONE_SECOND_TICKS);

    while (1) {
        select {
            case tmr when timerafter (event_time) :> time32_t now: {
                debug_print ("timerafter event_time %08X :> now %08X, lost %d ticks\n", event_time, now, now-event_time);

                event_time += ONE_SECOND_TICKS; // FUTURE TIMEOUT, calculated from event_time, not from now
                debug_print ("\ntimerafter event_time %08X\n", event_time);
            } break;
        }
    }
}

int main() {
    par {
        test_task();
    }
    return 0;
}

And here is the log. Observe that the 1 second increment 05F5E100 has these nice 00 as LSB, so it’s easy to see that the XC timerafter mechanism in itself has no skew whatsoever (57 as LSB). Repeated addition with a constant doesn’t usually have skew, but it’s nice to see it when applied to a timer as well:

timerafter event_time 0244E557 and ONE_SECOND_TICKS is 05F5E100 (100000000)
timerafter event_time 0244E557 :> now 0244F005, lost 2734 ticks

timerafter event_time 083AC657
timerafter event_time 083AC657 :> now 083AC659, lost 2 ticks

timerafter event_time 0E30A757
timerafter event_time 0E30A757 :> now 0E30A759, lost 2 ticks

timerafter event_time 14268857
timerafter event_time 14268857 :> now 14268859, lost 2 ticks

It’s interesting to see that the first now is way off from the timeout. I assume this has to do with the set-up of the select etc.?

Conclusion is that perhaps just outputting to void is less misleading:

while (1) {
    select {
        case tmr when timerafter (event_time) :> void: {
        ...
}

[[combinable]] on same core timing

This is a follow-up to the chapter above. I thought: what happens if I start two instances of the above test_task and let them run half a second out of phase?

Some of what I saw was so new to me that I decided to post this as [[combinable]] on same core timing on XCore Exchange. I have some ponderings there that I hope somebody will respond to.

XC may calculate interrupt latency even without interrupt functions

Observe that this is not a fault in the XC language! There is nothing wrong with the fact that the now time is not the same as the time that was waited for. This is by the XC language’s design: the values differ because timing out and scheduling take some time (and they may differ by quite a lot), and it’s sometimes crucial to get that value. I did find an example in the XMOS lib_i2c library, file i2c_master_async.xc. In the explanation of adjust_for_slip we read: “However, if the task falls behind due to other processing on the core then this function will adjust both the next event time and the fall time reference to slip so that subsequent timings are correct.” They know both the event_time (a name that I stole from that example) and now, so it’s possible to do something by knowing the slip. It’s about as easy as this: if we don’t control a pin with some timed port statement, and we should have changed the pin at event_time but were not allowed to before now, then we are still able to compensate in the code. I assume that just reading the system timer again would not have been as accurate as the timerafter’s output to now, by language design.

One should be aware of this. Tuning by placing tasks alone on a core, or by letting them share a core with other tasks with [[combinable]], would yield different results. I have made a positive point of the fact that XC doesn’t have interrupt functions (like occam didn’t have them either) in a lecture at NTNU in 2018 (Thinking about it: Channels more than connect threads. They protect them) – with the view that interrupts are difficult and it’s hard to find worst case timing. They are prioritised and nested, and you need to save and restore registers that you’d use in your interrupt. An interrupt event is what’s connected on the sender side of a chan or connected to changes of a port (pin(s) or its local timeouts), or timer timeouts for that matter (as here). This scheme will have much faster response (faster is always difficult to defend, so I’d say cleaner instead) – available out of the box in the XC universe. The lost time shown here is as close as XC comes to any “interrupt latency”. The two ticks you lose in some examples here are 2 * 10 ns = 20 ns. Give me any other processor where this is possible, with this elegant a scheme! One logical core is possible per “interrupt”, and they would have guaranteed max latency!

But still I have those interesting values here…:

combinable on same core timing code

#include <platform.h> // core
#include <stdio.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

typedef signed int time32_t;

#define ONE_SECOND_TICKS (1000 * XS1_TIMER_KHZ)

#define TEST_TIMER_VALUES_SKEW_PLACED 2

#if (TEST_TIMER_VALUES_SKEW_PLACED==0)
    #define INFO "UNPLACED"
    #define COMBINABLE
    #define PLACED_0
    #define PLACED_1
#elif (TEST_TIMER_VALUES_SKEW_PLACED==1)
    #define INFO "PLACED"
    #define COMBINABLE [[combinable]]
    #define PLACED_0 on tile[0].core[0]:
    #define PLACED_1 on tile[0].core[1]:
#elif (TEST_TIMER_VALUES_SKEW_PLACED==2)
    #define INFO "PLACED_SAMECORE"
    #define COMBINABLE [[combinable]]
    #define PLACED_0 on tile[0].core[0]:
    #define PLACED_1 on tile[0].core[0]:
#else
    #error
#endif

COMBINABLE // or not!
void test_task (const unsigned task_id, const unsigned initial_delay_ticks) {

    timer     tmr;
    time32_t  event_time; // Name from XMOS lib_i2c, both time stamps are used
    debug_print ("%s from #%u\n", INFO, task_id);

    tmr :> event_time; // Showing something else than delay_milliseconds(ms) etc:
    select {case tmr when timerafter (event_time + initial_delay_ticks) :> void: break;}

    // ONLY IN BOTTOM LOG: tmr :> event_time; // Reset time is CRUCIAL!

    while (1) {
        select {
            case tmr when timerafter (event_time) :> time32_t now: {
                debug_print ("Event_time %08X :> now %08X, #%u lost %d ticks\n", event_time, now, task_id, now-event_time);
                event_time += ONE_SECOND_TICKS; // FUTURE TIMEOUT, same value as "event_time" above
            } break;
        }
    }
}

int main() {
    par {
        PLACED_0 /* or not! */ test_task (0, 0); // No initial delay
        PLACED_1 /* or not! */ test_task (1, ONE_SECOND_TICKS/2);
    }
    return 0;
}

Observe that a tick is 10 ns, so we’d mostly lose 20 ns per timeout. If now had been used to calculate the next timeout, an error of 1 ms would creep up after 50000 counts here, which is 13.88.. hours if I’ve got the arithmetic right:

UNPLACED from #1
UNPLACED from #0
Event_time 02491CC5 :> now 02491CD9, #0 lost 20 ticks
Event_time 02491687 :> now 05440715, #1 lost 50000014 ticks
Event_time 083EF787 :> now 083EF789, #1 lost 2 ticks
Event_time 083EFDC5 :> now 083EFDC7, #0 lost 2 ticks
Event_time 0E34D887 :> now 0E34D889, #1 lost 2 ticks
Event_time 0E34DEC5 :> now 0E34DEC7, #0 lost 2 ticks

PLACED from #1
PLACED from #0
Event_time 02472CC3 :> now 02472CD6, #0 lost 19 ticks
Event_time 024726B0 :> now 0542173E, #1 lost 50000014 ticks
Event_time 083D07B0 :> now 083D07B2, #1 lost 2 ticks
Event_time 083D0DC3 :> now 083D0DC5, #0 lost 2 ticks
Event_time 0E32E8B0 :> now 0E32E8B2, #1 lost 2 ticks
Event_time 0E32EEC3 :> now 0E32EEC5, #0 lost 2 ticks
Event_time 1428C9B0 :> now 1428C9B2, #1 lost 2 ticks

Here there is an in-swing with 2089 and then a constant 1972 ticks lost. This also happens when the timer overflows after 42 seconds. Could this be some “interference” between the schedulings? Observe that the two tasks share a joined, common select since they run on the same core:

PLACED_SAMECORE from #0
PLACED_SAMECORE from #1
Event_time 02405B6B :> now 053B539D, #0 lost 50001970 ticks
Event_time 024062CA :> now 053B666A, #1 lost 50004896 ticks
Event_time 08363C6B :> now 08363C75, #0 lost 10 ticks
Event_time 083643CA :> now 08364BF3, #1 lost 2089 ticks
Event_time 0E2C1D6B :> now 0E2C1D75, #0 lost 10 ticks
Event_time 0E2C24CA :> now 0E2C2CF3, #1 lost 2089 ticks
Event_time 1421FE6B :> now 1421FE75, #0 lost 10 ticks
Event_time 142205CA :> now 14220D7E, #1 lost 1972 ticks
Event_time 1A17DF6B :> now 1A17DF75, #0 lost 10 ticks
Event_time 1A17E6CA :> now 1A17EE7E, #1 lost 1972 ticks
Event_time 200DC06B :> now 200DC075, #0 lost 10 ticks
Event_time 200DC7CA :> now 200DCF7E, #1 lost 1972 ticks
..
Event_time FC97456B :> now FC974575, #0 lost 10 ticks
Event_time FC974CCA :> now FC97547E, #1 lost 1972 ticks
Event_time 028D266B :> now 028D2675, #0 lost 10 ticks
Event_time 028D2DCA :> now 028D35F3, #1 lost 2089 ticks
Event_time 0883076B :> now 08830775, #0 lost 10 ticks
Event_time 08830ECA :> now 088316F3, #1 lost 2089 ticks
Event_time 0E78E86B :> now 0E78E875, #0 lost 10 ticks
Event_time 0E78EFCA :> now 0E78F7F3, #1 lost 2089 ticks
Event_time 146EC96B :> now 146EC975, #0 lost 10 ticks
Event_time 146ED0CA :> now 146ED87E, #1 lost 1972 ticks

See what happens when I reset the time with tmr :> event_time; just before the while loop (the line marked ONLY IN BOTTOM LOG in the code). There is no swinging in:

PLACED_SAMECORE from #0
PLACED_SAMECORE from #1
Event_time 02AE25CB :> now 05A91DF6, #0 lost 50001963 ticks
Event_time 05A91DA5 :> now 05A930C3, #1 lost 4894 ticks
Event_time 08A406CB :> now 08A406D5, #0 lost 10 ticks
Event_time 0B9EFEA5 :> now 0B9EFEAF, #1 lost 10 ticks
Event_time 0E99E7CB :> now 0E99E7D5, #0 lost 10 ticks
Event_time 1194DFA5 :> now 1194DFAF, #1 lost 10 ticks
...
Event_time FA0A26A5 :> now FA0A26AF, #1 lost 10 ticks
Event_time FD050FCB :> now FD050FD5, #0 lost 10 ticks
Event_time 000007A5 :> now 000007AF, #1 lost 10 ticks
Event_time 02FAF0CB :> now 02FAF0D5, #0 lost 10 ticks
Event_time 05F5E8A5 :> now 05F5E8AF, #1 lost 10 ticks
Event_time 08F0D1CB :> now 08F0D1D5, #0 lost 10 ticks

Client-server and call-based notification, placement and chanend numbers

Also see

  • XCore: Number of chanends on different boards? (in note 098) There are 32 chanends per tile. This is not a library of chanends, it’s hw architecture and it’s in the instruction set; the number cannot be extended. Some XMOS processors have two tiles and 64 chanends (an overview of them is XMOS series of processors (also in note 098)). A standard chan takes two chanends. An interface also uses chanends, but the answer is 1 or 2 or 3! See below
  • XCore: Calculating number of chanends (in note 098) It’s difficult and the algorithm for sharing, communicating and synchronising between tasks that XMOS use probably is one of their competitive factors. Of course, this is very closely related to the intricacies of the architecture of the XCORE processors. So we just have to go by examples. See below
  • Using a chanend in both directions (in note 141) The code example there saves you one chanend. But you must do the bookkeeping yourself!
  • The algorithm is complicated by the fact that “If a distributed task is connected to several tasks, they cannot safely change its state concurrently. In this case the compiler implicitly uses a lock to protect the state of the task.” (Note 141 [1]). I should have no examples of locks being used here, but I wouldn’t know for sure

Summary table

| Roles | Interface | Semantics | Placement | #timers | #chanends |
|---|---|---|---|---|---|
| client/server | start call, [[notification]], [[clears_notification]] | server non-blocking, notifies client when ready | Auto (none) | 3 | 4 |
| N clients/server | --"-- (All clients are fair!) | --"-- | --"-- | 4 (1 less on xCORE-200), 1 added per client | 7, 3 added per client |
| call-based | call with return params | server call blocks until it returns | Auto (none) | 3 | 3 |
| call-based | --"-- | --"-- | Same core | 2 | 1 |
| call-based | --"-- | --"-- | Different cores | 3 | 3 |

The table above shows a summary of the resource usage of the code in the folds below.

A comment about client-server non-blocking and call-based blocking roles

I go through two different kinds of architectures here.

Client-server non-blocking roles

Observe that a system built with tasks that relate to each other as client-server only is also deadlock free.

The first architecture is a type where the client(s) initiate data collection with start_collect_data, and this call will not block, ie. it will not wait until data has been produced. In my case producing data is just an increment of data (with data++), which the server does in due time (in this case: immediately). When data has been produced the server sends a data_ready [[notification]] (no data) to the client, which waits in a select and has been allowed to do other things in the meantime. The client then picks up the data with get_data, which [[clears_notification]] for the atomic protection that the compiler builds in for that period.

No select on state only

Observe the error message that XC gives us if we try with a state only select component: “error: input from a variable that is neither a channel, port nor timer” (error messages are listed in XC is C plus X). In other words: we cannot have a state only in a select case. I have tried this several times, and it never works: case (send_handle) :> void. I miss this, it’s in CSPm ((ready == is_ready) & SKIP, see FDR2 notes) and Promela (:: (A == true), see Promela).

It would have been nice to have. But then..

..I assume that XC is built like this very much on purpose. When all selects are events then there’s a different algorithm than if the scheduler would have to go another round in the while to see if there is a state only to satisfy, even recursively. I guess this would require advanced state analysis like the formal verification tools mentioned above do, and which an executable language typically won’t do. And timing analysis would be difficult. This is the reason that I had to use timers in my examples. They look misplaced, but in real life it doesn’t make too much of a difference. Except that it does eat a timer, a scarce resource!

Aside: Promela has a concept of runnable models, but that’s because of how it’s built: an executable model would really execute until it’s deadlocked or livelocked or whatever. It’s easy to think of it as code. I think we’re supposed to think of it as code. Besides, the SPIN tool builds a new tool (as new C code that’s compiled to a binary) that is the model verifier. On the other side, in CSPm and FDR4 (see here), there is the model only. No executable thinking with the model. FDR4 simply verifies the model. No execution of the model. But XC is an executable language. I guess that makes much of the difference.

Call-based blocking roles

This is like a remote procedure call (RPC, see Remote procedure call), where data is returned from the server on the single call there is. If the server now needs a long time to fetch the data then the client should be ok about that, per design! If the client uses all its fetched data every 1000 ms and it would block 500 of those ms then it’s no problem! Per design, any max response time of this client (if it’s a server for another task) is then allowed to be max 500 ms. If that is not acceptable, then use the client-server model. I have discussed blocking to a larger extent at Not so blocking after all.

Client-server interface

Client-server interface code

Observe that DEBUG_PRINT_TEST 1 or 0 does not change the number of cores, timers and chanends.

// See http://www.xcore.com/viewtopic.php?f=26&t=6729
#include <platform.h> // core
#include <stdio.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

typedef enum {false,true} bool;
typedef signed int time32_t;
typedef unsigned data_t;

typedef interface notify_if_t {

    void start_collect_data (void);

    [[notification]]
    slave void data_ready (void);

    [[clears_notification]]
    data_t get_data (void);

} notify_if_t;

[[combinable]]
void server_task (server notify_if_t i_notify) {
    timer     tmr;
    time32_t  time;
    bool      collectData = false;
    data_t    data = 0;

    tmr :> time;

    while (1) {
        select {
            case i_notify.start_collect_data (void) : {
                collectData = true;
                debug_print ("\n%s\n", "start");
                tmr :> time; // immediately
            } break;
            case (collectData == true) => tmr when timerafter (time) :> void : {
                collectData = false;
                data++; // This is supposed to take a while, that's why we used notification: to get it decoupled
                debug_print ("inc %u\n", data);
                i_notify.data_ready();
            } break;
            case i_notify.get_data (void) -> data_t return_Data : {
                debug_print ("sent %u\n", data);
                return_Data = data;
            } break;
        }
    }
}

[[combinable]]
void client_task (client notify_if_t i_notify) {
    timer     tmr;
    time32_t  time;
    bool      expect_notification = false;

    tmr :> time; // immediately

    while (1) {
        select {
            case (expect_notification == false) => tmr when timerafter (time) :> void : {
                i_notify.start_collect_data();
                expect_notification = true;
            } break;
            case (expect_notification == true) => i_notify.data_ready() : {
                data_t data = i_notify.get_data();
                debug_print ("got %u\n", data);
                expect_notification = false;
                time += XS1_TIMER_HZ; // 1 second
            } break;
        }
    }
}

int main() {
    /*
    Constraint check for tile[0]:
      Cores available:            8,   used:          2 .  OKAY
      Timers available:          10,   used:          3 .  OKAY
      Chanends available:        32,   used:          4 .  OKAY
      Memory available:       65536,   used:      10092 .  OKAY (4052 if DEBUG_PRINT_TEST 0)
        (Stack: 1080, Code: 8278, Data: 734)
    Constraints checks PASSED.
    */
    notify_if_t if_notify;
    par {
        server_task(if_notify);
        client_task(if_notify);
    }
    return 0;
}

N clients client-server interface code

You can set the number of clients. One timer and three chanends are added per client. Plus, the clients are treated fairly (open all folds in this note and search for “fair”). Also observe that on the xCORE-200 platform there is one less chanend. I assume this has to do with some added instruction or another use of lock instructions (doubtful). I have queried about this at XCore Exchange, see Num timers for startKIT and eXplorerKIT (27May2018). (The code also displays an issue when DO_PLACE == 1. I have reported this to XMOS (27May2018, Ticket 31198). The response is discussed in XC is C plus X [Replicated par and placements].)

#include <platform.h> // core
#include <stdio.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

typedef enum {false,true} bool;
typedef signed int time32_t;
typedef unsigned data_t;

typedef interface notify_if_t {

    [[guarded]]
    void start_collect_data (void);

    [[notification]]
    slave void data_ready (void);

    [[clears_notification]]
    data_t get_data (void);

} notify_if_t;

#define NUM_CLIENTS 2 // DO_PLACED 0: MAX 7. 1 timer and 3 chanends per client

[[combinable]]
void server_task (server notify_if_t i_notify[NUM_CLIENTS]) {
    timer     tmr;
    time32_t  time;
    bool      collectData = false;
    bool      session = false; // Necessary!
    data_t    data = 0;
    int       session_index_of_client;

    debug_print ("%s\n", "server_task");
    tmr :> time; // immediately

    while (1) {
        select {
            case (session == false) => i_notify[int index_of_client].start_collect_data (void) : {
                collectData = true;
                session = true;
                debug_print ("%d started\n", index_of_client);
                session_index_of_client = index_of_client;
                tmr :> time; // immediately
            } break;
            case (collectData == true) => tmr when timerafter (time) :> void : {
                collectData = false;
                data++; // This is supposed to take a while, that's why we used notification: to get it decoupled
                debug_print ("%d produced %u\n", session_index_of_client, data);
                i_notify[session_index_of_client].data_ready();
            } break;
            case i_notify[int index_of_client].get_data (void) -> data_t return_Data : {
                debug_print ("%d sent %u\n", index_of_client, data);
                session = false;
                return_Data = data;
            } break;
        }
    }
}

[[combinable]]
void client_task (client notify_if_t i_notify, const int index_of_client) {
    timer     tmr;
    time32_t  time;
    bool      expect_notification = false;
    unsigned  cnt = 0;
    debug_print ("client_task %d\n", index_of_client);

    tmr :> time; // immediately

    while (1) {
        select {
            case (expect_notification == false) => tmr when timerafter (time) :> void : {
                debug_print ("\n%d START\n", index_of_client);
                i_notify.start_collect_data();
                expect_notification = true;
            } break;
            case (expect_notification == true) => i_notify.data_ready() : {
                data_t data = i_notify.get_data();
                cnt++;
                debug_print ("%d GOT %u is #%u\n", index_of_client, data, cnt);
                expect_notification = false;
                time += XS1_TIMER_HZ; // 1 second
            } break;
        }
    }
}

int main() {
    #define DO_PLACED 0 // 1 simply does not work!

    notify_if_t if_notify[NUM_CLIENTS];
    #if (DO_PLACED == 0)
        /*
        STARTKIT:
        NUM_CLIENTS 1 Cores 2  Timers 3  Chanends  4  OKAY.
        NUM_CLIENTS 2 Cores 3  Timers 4  Chanends  7  OKAY.
        NUM_CLIENTS 3 Cores 4  Timers 5  Chanends 10  OKAY.
        NUM_CLIENTS 4 Cores 5  Timers 6  Chanends 13  OKAY.
        NUM_CLIENTS 5 Cores 6  Timers 7  Chanends 16  OKAY.
        NUM_CLIENTS 6 Cores 7  Timers 8  Chanends 19  OKAY.
        NUM_CLIENTS 7 Cores 8  Timers 9  Chanends 22  OKAY.

        XCORE-200-EXPLORER has one less timer (xTIMEcomposer 14.3.3):
        NUM_CLIENTS 1 Cores 2  Timers 2  Chanends  4  OKAY.
        NUM_CLIENTS 2 Cores 3  Timers 3  Chanends  7  OKAY.
        NUM_CLIENTS 3 Cores 4  Timers 4  Chanends 10  OKAY.
        NUM_CLIENTS 4 Cores 5  Timers 5  Chanends 13  OKAY.
        NUM_CLIENTS 5 Cores 6  Timers 6  Chanends 16  OKAY.
        NUM_CLIENTS 6 Cores 7  Timers 7  Chanends 19  OKAY.
        NUM_CLIENTS 7 Cores 8  Timers 8  Chanends 22  OKAY.
        */
        par {
            server_task(if_notify);
            par (size_t i = 0; i < NUM_CLIENTS; i++) {
                client_task (if_notify[i], i);
            }
        }
    #elif (DO_PLACED == 1)
        /*
        NUM_CLIENTS 1 Cores 1  Timers 2  Chanends 2  OKAY.  Constraints checks PASSED. BUT DOES NOT RUN!
        NUM_CLIENTS 2 Cores 1+ Timers 2+ Chanends 3+ MAYBE. PASSED WITH CAVEATS.       DOES NOT RUN
        NUM_CLIENTS 3 Cores 1+ Timers 2+ Chanends 4+ MAYBE. PASSED WITH CAVEATS.       DOES NOT RUN
        Constraint check for tile[0]:
          Cores available:            8,   used:          1+.  MAYBE
          Timers available:          10,   used:          2+.  MAYBE
          Chanends available:        32,   used:          4+.  MAYBE
          Memory available:       65536,   used:      12460+.  MAYBE
            (Stack: 1112+, Code: 10380, Data: 968)
        Constraints checks PASSED WITH CAVEATS.
        */
        par {
            on tile[0].core[0]: server_task(if_notify);
            par (size_t i = 0; i < NUM_CLIENTS; i++) {
                on tile[0].core[0]: client_task (if_notify[i], i);
            }
        }
    #elif ((DO_PLACED == 2) && (NUM_CLIENTS == 7))
        // XMOS code during Ticket 31198
        /*
        STARTKIT:
        NUM_CLIENTS 7 Cores 8  Timers 9  Chanends 22  OKAY.

        XCORE-200-EXPLORER has one less timer (xTIMEcomposer 14.3.3):
        NUM_CLIENTS 7 Cores 8  Timers 8  Chanends 22  OKAY.
        */
        par {
            on tile[0].core[0]: server_task(if_notify);
            // Using the par replicator index for core placement is unimplemented (14.3.3)
            on tile[0].core[1]: client_task (if_notify[0], 0);
            on tile[0].core[2]: client_task (if_notify[1], 1);
            on tile[0].core[3]: client_task (if_notify[2], 2);
            on tile[0].core[4]: client_task (if_notify[3], 3);
            on tile[0].core[5]: client_task (if_notify[4], 4);
            on tile[0].core[6]: client_task (if_notify[5], 5);
            on tile[0].core[7]: client_task (if_notify[6], 6);
        }
    #endif
    return 0;
}
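
The constraint-check figures in the comments of main() above follow a simple linear pattern for DO_PLACED 0. The sketch below, in plain C, only reproduces that arithmetic. The function names are hypothetical, and the formulas are fitted to the listed rows (NUM_CLIENTS 1 to 7); they are not a rule documented by XMOS.

```c
/* Resource usage for NUM_CLIENTS = n with DO_PLACED 0, fitted to the
   figures in the comment tables above (n = 1..7 only). */

/* One core for server_task plus one per client_task. */
unsigned cores_used(unsigned n) { return n + 1; }

/* startKIT figures: 3 timers at n = 1, then one more per client. */
unsigned timers_used_startkit(unsigned n) { return n + 2; }

/* XCORE-200 eXplorerKIT figures (xTIMEcomposer 14.3.3): one less timer. */
unsigned timers_used_xcore200(unsigned n) { return n + 1; }

/* 4 chanends at n = 1, then three more per client. */
unsigned chanends_used(unsigned n) { return 3 * n + 1; }
```

For NUM_CLIENTS 7 this gives 8 cores, 9 timers (8 on the XCORE-200) and 22 chanends, matching the last row of each table.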

N clients client-server interface log

This is the log with NUM_CLIENTS 7 and DO_PLACED 0. You can see that client 6 queues up first, then clients 0 to 5, before the server takes client 6 with the first “6 started”. Rather interesting:

6 START

0 START

1 START

2 START

3 START

4 START

5 START
6 started
6 produced 358
6 sent 358
6 GOT 358 is #52
0 started
0 produced 359
0 sent 359
0 GOT 359 is #52
1 started
1 produced 360
1 sent 360
1 GOT 360 is #52
2 started
2 produced 361
2 sent 361
2 GOT 361 is #52
3 started
3 produced 362
3 sent 362
3 GOT 362 is #52
4 started
4 produced 363
4 sent 363
4 GOT 363 is #52
5 started
5 produced 364
5 sent 364
5 GOT 364 is #52

Call-based interface

Call-based interface code

Observe that DEBUG_PRINT_TEST 1 or 0 does not change the number of cores, timers and chanends.

#include <platform.h> // core
#include <stdio.h>
#include <timer.h> // delay_milliseconds(200), XS1_TIMER_HZ etc

#define DEBUG_PRINT_TEST 1
#define debug_print(fmt, ...) do { if(DEBUG_PRINT_TEST) printf(fmt, __VA_ARGS__); } while (0)

typedef enum {false,true} bool;
typedef signed int time32_t;
typedef unsigned data_t;

typedef interface call_if_t {

    data_t collect_and_get_data (void);

} call_if_t;

[[combinable]]
void server_task (server call_if_t i_call) {
    data_t data = 0;

    while (1) {
        select {
            case i_call.collect_and_get_data (void) -> data_t return_Data : {
                data++; // This should take as little time as possible since it blocks
                return_Data = data;
                debug_print ("sent %u\n", data);
            } break;
        }
    }
}

[[combinable]]
void client_task (client call_if_t i_call) {
    timer     tmr;
    time32_t  time;

    tmr :> time; // immediately

    while (1) {
        select {
            case tmr when timerafter (time) :> void : {
                debug_print ("%s", "\nasked\n");
                data_t data = i_call.collect_and_get_data();
                debug_print ("got %u\n", data);
                time += XS1_TIMER_HZ; // 1 second
            } break;
        }
    }
}

int main() {
    #define DO_PLACED 2 // 1 is OPTIMAL

    call_if_t if_call;
    par {
    #if (DO_PLACED == 0)
        server_task(if_call);
        client_task(if_call);
        /* Constraint check for tile[0]:
          Cores available:            8,   used:          2 .  OKAY
          Timers available:          10,   used:          3 .  OKAY
          Chanends available:        32,   used:          3 .  OKAY
          Memory available:       65536,   used:       9816 .  OKAY (3840 if DEBUG_PRINT_TEST 0)
            (Stack: 1032, Code: 8066, Data: 718)
        Constraints checks PASSED. */
    #elif (DO_PLACED == 1)
        on tile[0].core[0]: server_task(if_call);
        on tile[0].core[0]: client_task(if_call);
        /* Constraint check for tile[0]:
          Cores available:            8,   used:          1 .  OKAY
          Timers available:          10,   used:          2 .  OKAY
          Chanends available:        32,   used:          1 .  OKAY
          Memory available:       65536,   used:       9760 .  OKAY (4128 if DEBUG_PRINT_TEST 0)
            (Stack: 656, Code: 8350, Data: 754)
        Constraints checks PASSED. */
    #elif (DO_PLACED == 2)
        on tile[0].core[0]: server_task(if_call);
        on tile[0].core[1]: client_task(if_call);
        /* Constraint check for tile[0]:
          Cores available:            8,   used:          2 .  OKAY
          Timers available:          10,   used:          3 .  OKAY
          Chanends available:        32,   used:          3 .  OKAY
          Memory available:       65536,   used:       9952 .  OKAY (3960 if DEBUG_PRINT_TEST 0)
            (Stack: 1048, Code: 8174, Data: 730)
        Constraints checks PASSED. */
    #endif
    }
    return 0;
}

Comments welcome

A 4-bit port and array of 1-bit ports match in a non-consistent manner

I came across this when I was going to get the lib_spi to run on a startKIT. The problem is with the CS-pins that are mapped onto an array of 1-bit ports in the original, but on the startKIT there is one CS and one EN mapped as first and second bit of a 4-bit port. When I disregarded the EN pin the code ran fine with the CS bit going low and high like it should.

But when I made a branch of the code and added an spi_master_2 (aside: full code here. This code takes care of both CS and EN bits, so it uses masks instead of 1-bit ports) I saw the situation. It’s like this (below).

The first bit of a 4-bit port is compatible with the first 1-bit port of the 1-bit port array. But there it correctly stops: the second bit of a 4-bit port is not compatible with the second 1-bit port of the 1-bit port array. From this follows that it should not have been allowed to send the 4-bit port instead of the array of 1-bit ports.

The type checker should, as far as I can see, not have allowed the compilation below. It even works for that first bit!

The semantics of p_ss[0] <: 1; with XS1_PORT_4C is that it outputs 0001 binary, but beyond that it’s completely undefined, as with p_ss[1] <: 1; (but the latter is not compilable, as I am not able to initialise p_ss for NUM_SPI_SLAVES 2 in the XS1_PORT_4C case). That’s fine! What is not fine is that it compiles and runs for NUM_SPI_SLAVES 1 in the XS1_PORT_4C case.

Defining XMOS_10848_PORT_PARAM_2 makes sense!

#include <platform.h>
#include <xs1.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <iso646.h>
#include <xccompat.h> // REFERENCE_PARAMs

// Example from lib_spi:
void spi_master (
        out port p_ss[num_slaves],
        static const size_t num_slaves)
{
    for(unsigned i=0;i<num_slaves;i++)
        p_ss[i] <: 1;
}

// #define XMOS_10848_PORT_PARAM_2

#ifdef XMOS_10848_PORT_PARAM_2
    #define NUM_SPI_SLAVES 2
    out port p_ss[NUM_SPI_SLAVES] = {XS1_PORT_1A, XS1_PORT_1B};
#else
    #define NUM_SPI_SLAVES 1
    out port p_ss[NUM_SPI_SLAVES] = {XS1_PORT_4C};
#endif

int main() {
    par {
        spi_master (p_ss, NUM_SPI_SLAVES);
    }
    return 0;
}

This also was posted on XCore Exchange at A 4-bit port and array of 1-bit ports match in a non-concistent manner (20Feb2018). There are some valuable comments there!
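
As a rough illustration of the mask approach mentioned above (handling CS and EN as bits of one 4-bit port instead of as 1-bit ports), here is a minimal sketch in plain C. It is not the spi_master_2 code. The mask and function names are hypothetical, and the bit positions are only assumed from the description of CS as the first bit and EN as the second bit.

```c
#include <stdint.h>

/* Assumed bit positions, taken from the text: CS is the first bit and
   EN the second bit of the 4-bit port. The names are hypothetical. */
#define CS_MASK    0x1u
#define EN_MASK    0x2u
#define PORT4_MASK 0xFu

/* Return the new 4-bit port value with the masked bits driven high. */
uint8_t port4_set_bits(uint8_t shadow, uint8_t mask) {
    return (uint8_t)((shadow | mask) & PORT4_MASK);
}

/* Return the new 4-bit port value with the masked bits driven low. */
uint8_t port4_clear_bits(uint8_t shadow, uint8_t mask) {
    return (uint8_t)(shadow & ~mask & PORT4_MASK);
}
```

The idea is to keep a shadow copy of the port value, change only the wanted bit in it, and then output the whole 4-bit value to the port in one go, so that pulling CS low does not disturb EN.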

Port as parameter in interface call

This should be legal code, but the 14.3.2 compiler causes a run-time crash:

#include <platform.h>
#include <xs1.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <iso646.h>
#include <xccompat.h> // REFERENCE_PARAMs

typedef enum {false,true} bool;
typedef interface my_if {

    void x_crashes (out port p_spi_aux_in_call);
    void y (void);

} my_if;

[[combinable]]
void my_client (client my_if i_my, out port p_spi_aux_in_call) {
    timer    tmr;
    unsigned current_time;
    bool     now = true;

    tmr :> current_time;
    while (1) {
        select {
            case tmr when timerafter(current_time) :> void: {
                current_time += XS1_TIMER_HZ; // Once per second
                if (now) {
                    i_my.y (); // This first run always succeeds
                } else {
                    i_my.x_crashes (p_spi_aux_in_call); 
                    // Run-time crash: ET_ILLEGAL_RESOURCE
                }
                now = not now;
            } break;
        }
    }
}

[[combinable]]
void my_server (server my_if i_my, out port p_spi_aux_in_init) {
    while (1) {
        select {
            case i_my.x_crashes (out port p_spi_aux_in_call): {
                printf("i_my.x crashes in line above\n"); // So never gets here
                break;
            }
            case i_my.y (void): {
                printf("i_my.y here\n");
                break;
            }
        }
    }
}

out port p_spi_aux_in_init = XS1_PORT_4C;
out port p_spi_aux_in_call = XS1_PORT_4D;

int main() {
    my_if i_my;
    par {
        on tile[0].core[0]: my_server (i_my, p_spi_aux_in_init);
        on tile[0].core[1]: my_client (i_my, p_spi_aux_in_call);
    }
    return 0;
}

There is more in XCore Exchange Crashing on port as parameter in interface call. Firstly, it should not cause a run-time crash to send the port along in an interface call, and secondly, there is another problem about stack calculation that I have only mentioned in that post.

Code by others ported to XC by me

Adafruit SSD1306 on XMOS board

I have been asked about the code that makes it possible to use the SSD1306 on XMOS boards.

The Adafruit monochrome 128×32 I2C OLED graphic display is their product id 931 (here), containing module UG-2832HSWEG02 with chip SSD1306 from Univision Technology Inc. You will find Univision’s data sheets in the said url as well.

I have wrapped my code into a zip, but it is not at all a “runnable” test program. I had that in my early xTIMEcomposer version control (Git) system, but I didn’t want to publish it here because my latest code is the greatest. Instead I have just pulled some files out of my aquarium project “as is”. Observe that the greater part of the SSD1306 code has been written by Adafruit (some based on Univision’s pseudo code), and then modified by me. I probably should have made a branch on their GitHub node (here). The reason I haven’t done it is that I think it is too different. Maybe this is a recipe for how not to do things, but I’ll give it a try and publish it here. The code works and has been taking care of my fishes for some time now. Disclaimer: It may be easier to port the Cpp and C files more directly (as I explained in the chapter above) but I didn’t do that. So here are .xc files for .cpp etc.

Observe that I have used the older XMOS code [module_i2c_master] and not the newer [lib_i2c], which is more advanced.

Here’s the code, which I guarantee nothing about and have no responsibility for: here (rev3). It’s rev3 (5Feb2018) after some fixes I got from a guy who actually got his 128×64 display up and running based on this code. I had my 128×32 display up. It handles only a single display; the code is not general for a number of displays, as there are too many defines of sizes etc. That should be easy to fix, though. This guy has done a great job, because what I have supplied is not a “runnable” file set: he had to make some small include files himself and remove some of the aquarium code. Should you find out that I have not included something very important then please mail me. Also, should you end up with a runnable test system for, for example, the startKIT, maybe we should join forces and do something on GitHub? This guy did, but it’s now in a product proper, not to be published.

XMOS code comments and errata

lib_spi

In https://www.xmos.com/support/libraries/lib_spi 3.0.2 User Guide (spi.pdf) the code example in chapter 2.1 should be like this (fixes are marked with “was” comments):

// My comment:              =  OBS ports, see comment below
in  buffered port:32 p_miso =  XS1_PORT_1A;  // was out
out port p_ss[1]            = {XS1_PORT_1B};
out buffered port:32 p_sclk =  XS1_PORT_1C;  // was 22
out buffered port:32 p_mosi =  XS1_PORT_1D;
clock clk_spi               =  XS1_CLKBLK_1;
int main(void) {
  spi_master_if i_spi[1];
  par {
    spi_master(i_spi, 1, p_sclk, p_mosi, p_miso , p_ss, 1, clk_spi);
    my_application(i_spi[0]);
  }
  return 0;
}


SPI_MODE_0 and SPI_MODE_1 assumed swapped in lib_spi

By careful examination, to the best of my ability, I have come to the assumption that lib_spi 3.0.2 has got SPI_MODE_0 and SPI_MODE_1 swapped all over, both in the documentation and in the code. (If I am wrong, nothing is better!) I have posted this to XCore Exchange, see SPI_MODE_0 and SPI_MODE_1 assumed swapped in lib_spi (22Feb2018).

The headings of the SPI modes show the first two wrongly. The correct mode is given after the arrow (and Mode 0 is SPI_MODE_0 etc.):

  • “1.1.1 Mode 0 – CPOL: 0 CPHA 1” → Mode 1 in heading and figure description
  • “1.1.2 Mode 1 – CPOL: 0 CPHA 0” → Mode 0 in heading and figure description

I have run the code with all four SPI modes and scoped the result, and then compared that to the Wikipedia SPI mode curve example picture here. My document is here.

Some examples from the code. The typedef in spi.h also has its comments wrong:

/** This type indicates what mode an SPI component should use */
typedef enum spi_mode_t {
  SPI_MODE_0, /**< SPI Mode 0 - Polarity = 0, Clock Edge = 1 CORRECT: Clock Edge = 0*/
  SPI_MODE_1, /**< SPI Mode 1 - Polarity = 0, Clock Edge = 0 CORRECT: Clock Edge = 1*/
  SPI_MODE_2, /**< SPI Mode 2 - Polarity = 1, Clock Edge = 0 */
  SPI_MODE_3, /**< SPI Mode 3 - Polarity = 1, Clock Edge = 1 */
} spi_mode_t;

And here is another example. I assume the code has to be modified to get it correct. For now I can just swap SPI_MODE_0 and SPI_MODE_1 and use one for the other.

static void get_mode_bits(spi_mode_t mode, unsigned &cpol, unsigned &cpha){
    switch(mode){
        case SPI_MODE_0:cpol = 0; cpha= 1; break;
        case SPI_MODE_1:cpol = 0; cpha= 0; break;
        case SPI_MODE_2:cpol = 1; cpha= 0; break;
        case SPI_MODE_3:cpol = 1; cpha= 1; break;
    }
}
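
For comparison, here is a minimal sketch in plain C of the conventional numbering, as in the Wikipedia figure referred to above, where the mode number is (CPOL << 1) | CPHA. It is not lib_spi code and not a proposed patch. The type and function names are hypothetical, and pointers are used instead of XC reference parameters. The second function only expresses the workaround of using SPI_MODE_0 and SPI_MODE_1 for each other.

```c
/* Hypothetical names, chosen so they do not clash with lib_spi's spi_mode_t. */
typedef enum {
    CONV_SPI_MODE_0,
    CONV_SPI_MODE_1,
    CONV_SPI_MODE_2,
    CONV_SPI_MODE_3
} conv_spi_mode_t;

/* Conventional numbering: mode = (CPOL << 1) | CPHA. */
void conv_get_mode_bits(conv_spi_mode_t mode, unsigned *cpol, unsigned *cpha) {
    *cpol = ((unsigned)mode >> 1) & 1u;
    *cpha = (unsigned)mode & 1u;
}

/* Workaround: exchange mode 0 and mode 1 before passing the mode to a
   library that has the two swapped. Modes 2 and 3 are left unchanged. */
conv_spi_mode_t swap_mode_0_and_1(conv_spi_mode_t mode) {
    switch (mode) {
        case CONV_SPI_MODE_0: return CONV_SPI_MODE_1;
        case CONV_SPI_MODE_1: return CONV_SPI_MODE_0;
        default:              return mode;
    }
}
```

With this numbering, Mode 0 is CPOL 0 and CPHA 0, and Mode 1 is CPOL 0 and CPHA 1. That is the opposite of what get_mode_bits above returns for the first two modes.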


MSB is output first

In <spi.h> the comment to transfer8 and transfer32 wrongly says:

The data will be transmitted least-significant bit first. (sic)

Here is the code, and it does the opposite of that comment. However, this also corresponds to the correct figures in spi.pdf: most-significant bit first (MSBFIRST):

// uint8_t data is the data to be output by transfer8_sync_zero_clkblk
for(unsigned i=0;i<8;i++){ 
    // Code omitted 
    if(!isnull(mosi)){ 
        partout_timed(mosi, 1, data>>7, time); 
        data<<=1; // shift the next lower bit up into BIT7
    }
    // Code omitted
}

The first bit to output is data>>7 which is BIT7 or MSB. Then BIT6 is shifted left into BIT7 and output next. There is no doubt: MSB first!
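
The bit order can be checked off-target with a small sketch in plain C that mimics only the data>>7 and shift-left steps of the loop above. The function name is hypothetical, and the port output is replaced by storing each bit in an array.

```c
#include <stdint.h>

/* Store the bits of data in the order the loop above would output them:
   always bit 7 of the current value, then shift the value left by one. */
void msb_first_bits(uint8_t data, uint8_t bits[8]) {
    for (unsigned i = 0; i < 8; i++) {
        bits[i] = (uint8_t)(data >> 7);   /* BIT7 is output */
        data = (uint8_t)(data << 1);      /* next lower bit moves into BIT7 */
    }
}
```

For data 0xA5 (binary 1010 0101) the array ends up as 1,0,1,0,0,1,0,1, which is the most-significant bit first.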

More

I’ll be back with the ports used on the startKIT breakout board (here) and on my XCORE-200 eXplorerKIT breakout board (here and here).
