Some questions about SDL

Contents

1 Intro
2 Questions
3 Some notes on SDL runtimes
4 References
5 Comments
- 5.1 Rolv Bræk

History

3jan2013 initial, without comments
14Mar2013: Comments #1. Rolv Bræk
2July2013: [1.B] and [12] are new
20Jan2014, inserted [13]
28Sept2016, inserted Notes on SDL runtimes
11Oct2023, Telox mentioned

Intro

I hope that somebody from the SDL (Specification and Description Language) community [1],[3] will find these points interesting to reflect on. If you send me a mail with your answers I will of course publish them as they are. Alternatively, submit a comment here. I also welcome comments on any aspect of this blog note, including general ones. If you want to be anonymous, fine. I have tried to find challenging aspects. I do think that any technology has challenging sides; this is something I have tried to discuss in my publications and blogs over the years. I will not reply to the comments, since the questions are my final words here.

The idea behind my most recent paper (about the «XCHAN») was to merge synchronous and asynchronous methodologies [11]. XCHAN is meant to have the virtues of both.

Questions

I mostly take quotes from [2] (Bræk and Haugen) and try to make questions from those. But [8] (Ellsberger, Hogrefe, Sarma) is also a good reference.

Disclaimers: My wording here may not correspond fully to SDL idiomatic jargon. Also, I am not updated on how SDL has developed since 1993(?!) It dates its history all the way back to 1976, as a recommended methodology, by CCITT, aimed at the telecom industry. However, do observe that [1.B] points to the fresh SDL-2010 document E 37558 posted by ITU on 2012-08-24. Also, have a look at [12], Appendix III «Evolution of the Specification and Description Language»: SDL-88, SDL-92, SDL-2000 and SDL-2010.

«To be general, one must deal with the buffer overflow problem by delaying the producer when the buffer reaches its maximum capacity. In consequence, output from an SDL process may have to be delayed until the receiving buffer is ready. This deviation from the SDL semantics can hardly be avoided in a finite implementation. Care is needed to reduce to a minimum the practical problems this may cause» (page 233 of [2])

Q1. How would this be handled in a safety critical system?
Q2. How may fast producers and slow consumers on a general basis be solved?
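
The «delaying the producer» from the quote above can be sketched with a bounded queue. This is only a minimal illustration in Python, not SDL and not taken from [2]; the function name `run_demo` and all parameters are made up for the example. The producer first tries a plain «send and forget», and when the finite buffer is full it is held back until the consumer has made room.

```python
import queue
import threading
import time


def run_demo(capacity=4, n_messages=20, consume_delay=0.001):
    """Fast producer, slow consumer, finite input queue (hypothetical sketch)."""
    inbox = queue.Queue(maxsize=capacity)  # the finite "input port"
    received = []

    def consumer():
        for _ in range(n_messages):
            received.append(inbox.get())  # blocks while the queue is empty
            time.sleep(consume_delay)     # the consumer is the slow party

    worker = threading.Thread(target=consumer)
    worker.start()

    delayed = 0
    for i in range(n_messages):
        try:
            inbox.put_nowait(i)           # SDL-style "send and forget"
        except queue.Full:
            delayed += 1                  # buffer full: the output must wait
            inbox.put(i)                  # blocks the producer until there is room
    worker.join()
    return received, delayed


if __name__ == "__main__":
    received, delayed = run_demo()
    print(len(received), "messages received,", delayed, "sends were delayed")
```

No message is lost and the order is kept, but how often the producer is delayed depends on timing. That timing dependence is the deviation from SDL semantics that Q1 and Q2 ask about.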

«Hence, save may be implemented by adding a save queue to the data of each process and by moving input from this queue to the front of the normal input queue at the end of each transition to a new state. Since this may be rather time consuming, save should be avoided in time-critical applications» (page 285 of [2])

Q3. How is save avoided?
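
A minimal sketch of the save queue described in the quote, assuming a process is given as a transition table plus a set of saved signals per state. The class `Process` and its fields are hypothetical and not how any real SDL runtime is coded. A signal that is neither handled nor saved is dropped here, which matches the default that Rolv Bræk describes in his comment to Q4 below.

```python
from collections import deque


class Process:
    """Toy SDL-like process with an input queue and a save queue (hypothetical)."""

    def __init__(self, state, transitions, saves):
        self.state = state
        self.transitions = transitions  # {(state, signal): next_state}
        self.saves = saves              # {state: set of signals saved in that state}
        self.input_queue = deque()
        self.save_queue = deque()
        self.log = []                   # signals that actually caused a transition

    def send(self, signal):
        self.input_queue.append(signal)

    def step(self):
        """Consume one signal; return False when the input queue is empty."""
        if not self.input_queue:
            return False
        signal = self.input_queue.popleft()
        if (self.state, signal) in self.transitions:
            self.state = self.transitions[(self.state, signal)]
            self.log.append(signal)
            # End of transition: saved signals go back to the front, oldest first.
            self.input_queue.extendleft(reversed(self.save_queue))
            self.save_queue.clear()
        elif signal in self.saves.get(self.state, set()):
            self.save_queue.append(signal)
        # Otherwise the signal is neither handled nor saved and is dropped.
        return True


if __name__ == "__main__":
    p = Process(
        state="idle",
        transitions={("idle", "req"): "busy", ("busy", "done"): "idle"},
        saves={"busy": {"req"}},
    )
    for s in ["req", "req", "done"]:
        p.send(s)
    while p.step():
        pass
    print(p.log, p.state)
```

The second `req` arrives while the process is busy, is saved, and is handled only after `done` has brought the process back to idle. The extra queue handling on every transition is the cost the quote warns about.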

Since an SDL process is not able to switch off a possible input message, it must receive what it gets. There is something called an «enabling condition» ([8] 6.6 page 185), but this seems to equal just not listening on the input in a certain state. And it’s not like the matching mechanism in Erlang [9], which analyses the queue. As seen from the process this is the same as a guarded choice in CSP [10], but that has the important additional semantics that it may also block a sender (on the first attempt to send if the channel is synchronous or unbuffered, or only once the buffer is full if the channel is asynchronous or N-buffered). In SDL, if a message is not acceptable in the present state, it’s filled into the save queue (above). This means that SDL does not have WYSIWYG semantics [5]. This means that in SDL, to code any one process, one must know how all the other processes that it receives input from (or communicates with) are coded. The Message Sequence (protocol) is not enough; what you see (in here) is not what you get (because you have to know more than the protocol: when can this message happen?). One cannot say that an event (message) cannot happen in this state (but one can save it). Also, one cannot prevent a sender from changing state (again and again) after it has sent a message to this process. So a possible session between processes becomes quite complex to code, since any one (not wanted) message can come in between.

Q4. How does this influence SDL’s ability to make shelf software (libraries)?
Q5. How is this viewed for a safety critical system?

«The act of aligning the operations of different concurrent processes in relation to each other is generally called synchronisation. Synchronisation is necessary not only to achieve correctness in communication, but also to control access to shared resources in the physical system.

In SDL, synchronisation is achieved by means of signal queues of processes and channels.
Consider two SDL processes that communicate. The sending process may send a signal at any time because it will be buffered in the input port of the receiving process. The receiving process may then consume the signal at a later time.
This is a buffered communication scheme in which the sender may produce infinitely many signals without waiting for the receiver to consume them. It is often referred to as asynchronous communication» (page 232 of [2])

Since the buffer at the «input port» is infinite it will never overflow and «FSM send» (and «forget» as in «send and forget») will not have to be checked for overflow. Book [8] (3.8.1 page 62) specifies that «there is an input queue associated with each process instance». Therefore a global buffer pool for all messages is an equivalent implementation if it also is assumed to be infinite. Q1 and Q2 show that this semantics is not implementable in the real world. Again, [8] does not mention «overflow» or «buffer», at least not in the Index; I have also browsed the book for them (or the like) and have not been able to find them.

Q6. Still the infinite thinking is often kept, also for implementations. How is this viewed in a safety critical context?

SDL processes have internal state. Every input message is decoded in the context of the present state. So, SDL processes are «communicating state machines». Holzmann [4] shows that such communicating state machines can communicate asynchronously or synchronously. Since global state advances by one state per transition anyhow, it does not matter which of the executable machines takes the next state transition.

Q7. In my experience I see many people with background in SDL who seem to claim that asynchronous communication is needed. Is this a general impression? If so, why?

The Ada Ravenscar Profile (via [6]) defines a profile for safety critical systems. The Ada select mechanism uses queues to store the entry calls into the rendezvous. This introduces a «noncontrolled» nondeterminism. CSP describes «internal choice» as its «controlled» nondeterministic choice, where the choice taken makes no difference. The noncontrolled nondeterminism makes it impossible to analyse timing behaviour [7]. This is the reason why Ada needed a profile, which prohibits the use of selects.

Q8. Is there any such safety-critical profile for SDL?

Finally

QX. Are any of the questions here unfair to SDL?
QY. Do any of the questions reveal a basic misunderstanding or even misconception of SDL on my behalf?
QZ. Any question I have missed?

Some notes on SDL runtimes

At a get-together (in Sept 2016) named «SISU 20 år» (LinkedIn) at the Department of Informatics at the University of Oslo (Institutt for Informatikk (Ifi), Universitetet i Oslo) we discussed what has happened over the roughly 20 years since the «SISU» project was closed around 1997. SISU was a grandiose project of The Research Council of Norway (Forskningsrådet). It went on from 1988 to 1997 and was administered by ITUF (IT-industriens Utviklingsforum), with a huge budget of about 40 million NOK at the time. Autronica participated to some extent, also in some of the «SISU Forum» that were arranged. About 15 companies participated.

At the meeting we touched on the SDL runtime («SDL RTOS», in Norwegian: kjøresystem) that had been developed then, and it was noted that this seemed to be the prevalent scheduler even now, for the few users there seem to be. Here’s a short summary. I have also added some matters I tried to pick up in a private conversation afterwards.

Disclaimer: I may have got some of the details wrong. I cannot find any trace of this on the net, that’s why I’m adding it here. It should basically be open source(?) An authoritative link would be much better than this fragmentary list:

A company called Garex at the time (now Indra Navia) developed a run-time called TST in the SISU project.
* Update 11Oct2023: Stein Erik Ellevseth tells me in a mail that this is not correct. TST and Telox SDL Tools were both developed by Telox in the 80s. Telox was also active in the SISU project
* Telox was a Norwegian company, I think now only seen as part of the ancestry of Kantega. I don’t think it has any relation to telox.com
TST seemed to have been based on an earlier runtime developed by ELAB (or update, see above: DELAB)
A message queue is a linked list pointing to message spaces that are taken out of a single message pool
- I think this means that data is moved into it at sending and out of it at reception. Two memcpy
- Provided it’s not a pointer that’s sent
- The buffer pool may become full with no more space for new messages, overflow
- (A CSP runtime with channels, buffered or not, will never overflow)
There is a message queue into each group of SDL tasks
- I think a guy mentioned that there may be additional queues if there are more priorities
There is a timed-out queue into each SDL FSM task (or group?). This queue is always checked by the scheduler before the message queue. It’s prioritised
- I asked about timers that were cancelled after they had reached the timed-out message queue. In my experience it’s easy to remove a timer in the list-of-next-timeouts while it’s still in that list. (We, in our runtime, insert the timed-out message into the common message queue. A remove of a timeout when it’s in that queue is not done. So, in the CHAN_CSP runtime that I wrote (on top of an internal SDL runtime) I had to tag each timer and later timed-out message so that I could stop it from entering the process. See my «From message queue to ready queue» from «First ERCIM Workshop on Software-Intensive Dependable Embedded Systems» in 2005.) The solution in TST assures that once a timer has timed out and is in the timed-out message queue the process will be scheduled. Therefore a timed-out timer message cannot enter the SDL FSM task
- But what about this: an FSM has ordered more than one timer, both have timed out, and the FSM is scheduled because of this before any other message. The second is then cancelled in that scheduling, but it’s still in the timed-out queue. To solve this the TST probably also cleans up in that queue
- How, with this solution, they solve the fact that a not cancelled timer’s timed-out message may arrive prioritised before the real message also present, I am not sure. Say the FSM waits for a message, or for a timeout if the message doesn’t arrive. The message arrives first, but before the scheduler comes around to scheduling the process the timer also times out. In the FSM the message would then still arrive after the timeout. So the FSM has to handle that, while our FSMs have to handle a cancelled timed-out message arriving
- (A CSP scheduler or process will not see any of these two problems)
..
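
The tagging of timers mentioned in the list above can be sketched like this. It is only an illustration of the idea, assuming one common message queue for timeouts and ordinary messages; the class `TimerFilter` and its method names are invented for the example and are neither the CHAN_CSP runtime nor TST.

```python
from collections import deque


class TimerFilter:
    """Tags timers so a cancelled but already queued timeout is recognised."""

    def __init__(self):
        self.queue = deque()  # common queue: timeouts and ordinary messages
        self.next_tag = 0
        self.active = {}      # timer name -> tag of the currently armed timer

    def start(self, name):
        """Arm (or re-arm) a timer and return the tag its timeout must carry."""
        self.next_tag += 1
        self.active[name] = self.next_tag
        return self.next_tag

    def expire(self, name, tag):
        """Called by the timer service: the timeout becomes a queued message."""
        self.queue.append(("timeout", name, tag))

    def cancel(self, name):
        # A queued timeout is left where it is; only the tag is forgotten.
        self.active.pop(name, None)

    def send(self, payload):
        self.queue.append(("msg", payload, None))

    def receive(self):
        """Return the next message for the FSM, skipping stale timeouts."""
        while self.queue:
            kind, value, tag = self.queue.popleft()
            if kind == "timeout":
                if self.active.get(value) != tag:
                    continue            # cancelled or restarted: never reaches the FSM
                del self.active[value]  # a delivered timeout disarms the timer
            return kind, value
        return None


if __name__ == "__main__":
    rt = TimerFilter()
    tag = rt.start("T1")
    rt.expire("T1", tag)  # the timeout is queued first
    rt.send("reply")      # then the awaited message arrives
    rt.cancel("T1")       # the FSM cancels before it has seen the timeout
    print(rt.receive(), rt.receive())
```

The price is one tag comparison per received timeout. The FSM code itself then never has to handle a cancelled timed-out message.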

References

[1] SDL = Specification and Description Language. http://en.wikipedia.org/wiki/Specification_and_Description_Language
[1.B] Overview of documents at http://www.itu.int/rec/T-REC-Z.100/en
[2] Engineering Real Time Systems. An object-oriented methodology using SDL. Rolv Bræk & Øystein Haugen. Prentice Hall 1993. ISBN 0-13-034448-6
[3] SDL Forum: http://www.sdl-forum.org/
[4] Design and Validation of Computer Protocols. Gerard J. Holzmann. Prentice Hall, 1991. ISBN 0-13-539925-4. (This book has newer releases called «Spin Model Checker», by the same author). See pages starting at 167. Available on the net at http://spinroot.com/spin/Doc/Book91_PDF/x20v_1991.pdf
[5] http://web.archive.org/web/19991013044050/http:/www.cs.bris.ac.uk/~alan/Java/ieeelet.html (wait a while, the Internet Archive needs time to find it)
[6] Channels and rendezvous vs. safety-critical systems. Øyvind Teig. See my note 035
[7] Nondeterminism. Øyvind Teig. See https://www.teigfam.net/oyvind/home/049-nondeterminism/
[8] «SDL: formal object-oriented language for communicating systems», Jan Ellsberger, Dieter Hogrefe, Amardeo Sarma. Prentice Hall, 1997. ISBN 0-13-632886-5
[9] Erlang programming language, see http://en.wikipedia.org/wiki/Erlang_(programming_language). There is an interesting blog about Erlang in which I participated in an ongoing discussion with the author at http://jlouisramblings.blogspot.no/2013/01/how-erlang-does-scheduling.html
[10] CSP Communicating Sequential Processes, see http://en.wikipedia.org/wiki/Communicating_sequential_processes
[11] «XCHANs: Notes on a New Channel Type», by Øyvind Teig, see http://www.teigfam.net/oyvind/pub/pub_details.html#XCHAN
[12] «ITU-T Z.100 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (12/2011). SERIES Z: LANGUAGES AND GENERAL SOFTWARE ASPECTS FOR TELECOMMUNICATION SYSTEMS.» Formal description techniques (FDT) – Specification and Description Language (SDL). Overview of SDL-2010. Read at https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-Z.100-201112-I!!PDF-E&type=items
[13] «Tool support for the rapid composition, analysis and implementation of reactive services» by Frank Alexander Kraemer, Vidar Slåtten, Peter Herrmann, in The Journal of Systems and Software 82 (2009) 2068–2080. See http://www.item.ntnu.no/people/personalpages/fac/kraemer/publications/krshjss09

Comments

As mentioned at the top of this blog, any comments by anybody will not be responded to by me. But you may respond to the last comment here! Send me a mail and I will of course publish your comments as they are. Alternatively, submit comments here. I would certainly welcome comments that I would not agree with; the more you think my fingers would be itching to respond, the better!

Rolv Bræk

I have been so fortunate as to have NTNU professor Rolv Bræk (one of the authors of [2]) fill in a specialist’s response to my questions. (14Mar2013)

Q1. How would this be handled in a safety critical system?

«This is not problem particular to SDL, it is a problem in any system where messages may be produced faster than they are consumed, let us call it overload. Whenever there is overload a message buffer will grow. When there is underload it will shrink. A system must therefore be designed with relatively short periods of overload (bursts) and relatively longer periods of underload. Otherwise it will eventually overflow. A system with statistical load variations must be dimensioned with sufficient buffers to handle peaks of overload. Short periods of buffer overflow during overload may then well be handled by delaying the producer, or by dropping messages. Where this is not acceptable one must ensure through careful design that the buffer never will overflow.»

Q2. How may fast producers and slow consumers on a general basis be solved?

«Sometimes this can be solved at the design level by applying acknowledgements that provide back pressure and prevents overload. This does not help if time is critical, as the acknowledgments will cause delays. In such cases one must somehow speed up consumption, by using faster processors for example, or reduce the load.»
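
As an aside to the answer above, acknowledgements that provide back pressure can be sketched as a window of unacknowledged messages. This is a simplified, tick-based model with assumed rates, not an SDL design; the function `simulate` and its parameters are hypothetical. The producer may have at most `window` messages outstanding, and every consumed message is acknowledged at once.

```python
from collections import deque


def simulate(n_messages, window, sends_per_tick=3, consumes_per_tick=1):
    """Fast producer throttled by acknowledgements (hypothetical tick model)."""
    buffer = deque()
    unacked = 0    # sent but not yet acknowledged
    next_msg = 0
    consumed = []
    max_buffer = 0
    ticks = 0
    while len(consumed) < n_messages:
        ticks += 1
        # Producer: fast, but only allowed to send while it holds credit.
        for _ in range(sends_per_tick):
            if next_msg < n_messages and unacked < window:
                buffer.append(next_msg)
                next_msg += 1
                unacked += 1
        max_buffer = max(max_buffer, len(buffer))
        # Consumer: slow; each consumed message is acknowledged immediately.
        for _ in range(consumes_per_tick):
            if buffer:
                consumed.append(buffer.popleft())
                unacked -= 1
    return consumed, max_buffer, ticks


if __name__ == "__main__":
    consumed, max_buffer, ticks = simulate(n_messages=10, window=2)
    print(max_buffer, ticks)
```

In this model the buffer never holds more than `window` messages, so it can be dimensioned exactly. The producer is in turn slowed down to the pace of the consumer, which is the delay the answer warns about when time is critical.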

Q3. How is save avoided?

«Save is avoided by treating the saved signals explicitly instead of saving them. However, this may cause the process graph to grow considerably. The main use of save is when a process needs to postpone inputs from one source while handling inputs from another source. This can sometimes be achieved using priority mechanisms in the runtime system, that give priority to internal signals over external signals. But this is formally done outside SDL.»

Q4. How does this influence SDL’s ability to make shelf software (libraries)?

«I think you are right that SDL does not have what you call WYSIWYG semantics. The interface definitions in SDL are static as most other interface definitions. They are two way however.

To learn the interface behaviour one must look at the behaviour defined in the process graph, which comprises the full behaviour. There is a lot of work on interface behaviour in the literature. In the textbook on engineering real time systems (that you refer to) the principle of deriving interface behaviours from process behaviour is explained in the chapter on validation. This work has been carried further in the PhD work of Jacqueline Floch. The relationship between process behaviours and interface behaviours is that of a projection where all interactions on other interfaces are hidden. Thus, interface behaviour is derived from the process behaviour and used to understand how the process works without looking inside. It is also used for simplified validation of links connecting two interfaces and for consistency checks of process behaviours. Interestingly the consistency checks finds pathologies in the design that would cause interworking problems in any environment. There is no tool support for this as yet. A similar idea is used in the reactive building blocks supported by the Arctis toolset (see BitReactive [and added by Teig 20Jan2014: [13]]). Here blocks have external state machines, called ESMs, that defines their external behaviour without going into any internal details. This allows reuse without looking into the reactive blocks, and is probably closer to what you call WYSIWYG?

It is not correct that messages that are not specified are saved in SDL. The default is to discard such messages. To save a message it has to be explicitly saved in every state where it shall be saved.»

Q5. How is this viewed for a safety critical system?

«For safety critical systems it is essential to prove (or at least build confidence) that so-called safety properties are satisfied by a design. In principle this is done by expressing the safety properties formally in the property language of a model checker, and then running the model checker to determine if they are satisfied or not. This demands that the design is expressed in a formal way that can be input to a model checker. Some SDL tools have built in model checkers, suitable for this. The main problem, is the exploding state space that often make model checking impractical. Interface projections as outlined above help to reduce the state space by hiding the inner details of processes, thus making model checking more scalable. The same applies to the ESMs in Arctis which enable a compositional analysis combined with automatic code generation.

To succeed with safety critical systems one should have an integrated toolset where the safety properties are proved and the implementation derived from the same models.»

Q6. Still the infinite thinking is often kept, also for implementations. How is this viewed in a safety critical context?

«Answered above I think.»

Q7. In my experience I see many people with background in SDL who seem to claim that asynchronous communication is needed. Is this a general impression? If so, why?

«I normally favour asynchronous communication, I must admit. I do not say it is necessary in all situations, mind you. There are cases where buffering is not needed and synchronous communication is better. Even when implementing SDL one can drop buffering and map SDL signals to synchronous method calls (or procedure calls) in given cases, as is explained in the textbook by Haugen and myself. However, this puts restrictions on the communication patterns and event timing in the SDL system. Using asynchronous communication is a simple way to remove the structural constraints, and (in most cases) also to isolate the time critical parts from the less time critical parts. For these reasons I advise my students to go for asynchronous communication as default, unless they are certain that method calls will be sufficient. Since many students (and I think many seasoned programmers too) believe that messaging entails a lot of overhead I try to explain that this is not normally the case. Remote method invocation (or RPC) for one entails much more than simple method calls and may involve scheduling delays and messaging for network transport. One may argue that messaging has a closer fit with distributed communication and thereby simplifies distribution transparency.

Synchronous communication will in some cases entail more waiting (on receivers to be ready) than asynchronous communication, and therefore more delays. Consider two processes, a producer and a consumer. In the buffered asynchronous case the producer may send a number of messages (n) in a row without any waiting provided there is available space in the buffer. In the unbuffered synchronous case, both processes are simultaneously taking part in the interaction, which means that a message is not sent until the consumer is ready to receive it. If the consumer is busy, the message has to be delayed until the consumer is ready. It is this delay (dk) I have in mind. If the consumer always is ready the delay dk will be zero, but this puts constraints on the relative timing and execution speeds that may be relaxed in the asynchronous case.

If context switching between the two processes is needed, one will experience two context switches per message using synchronous communication. In the asynchronous case the number of context switches may be considerably reduced if the producer is allowed to produce many messages before the consumer is activated.

In my experience from many product developments, I have never experienced a situation where buffered messaging proved to be too inefficient all things considered. I have however, seen cases where synchronous procedure calls proved to be a show stopper for the realisation of new services due to the inherent restrictions on communication structures and timing.

I do not think the same restrictions would apply to the CSP communication case.»
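
The point about context switches in the answer above can be put into a very idealised model. It assumes one producer and one consumer sharing a single processor, where the consumer is only activated when the producer cannot continue. The function `context_switches` is made up for the illustration, and a capacity of 0 stands for the unbuffered synchronous case.

```python
def context_switches(n_messages, capacity):
    """Count switches between one producer and one consumer (idealised model).

    capacity == 0 models unbuffered synchronous communication, where each
    message is handed over directly and the consumer must run for every one.
    """
    switches = 0
    sent = 0
    while sent < n_messages:
        # The producer runs until the buffer is full or it has nothing left.
        burst = min(max(capacity, 1), n_messages - sent)
        sent += burst
        # Switch to the consumer to take the messages, then back again.
        switches += 2
    return switches


if __name__ == "__main__":
    for capacity in (0, 4, 12):
        print(capacity, context_switches(12, capacity))
```

For 12 messages this model gives 24 switches in the synchronous case, 6 with room for four messages and 2 with room for all twelve. Real schedulers, priorities and several processors will change the numbers, so this only shows the trend.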

Q8. Is there any such safety-critical profile for SDL?

«Not to my knowledge. There is a UML profile (MARTE) addressing real time and performance. Maybe there is something associated with SysML?»

QX. Are any of the questions here unfair to SDL?
QY. Do any of the questions reveal a basic misunderstanding or even misconception of SDL on my behalf?
QZ. Any question I have missed?

«I have pointed out the misunderstandings I could see. I see no unfair questions! It is not a matter of ‘fairness’, but realizing fundamental problems and finding solutions. In my mind SDL has many limitations, but still provides something very useful. Personally, I think the most important concepts are state machines and asynchronous communication. SDL is just a language for this.»

Commented by Rolv Bræk (Braek)