CANopen slave crashes on "Start Remote Node" command from master, SYNC related

2022-07-06
2022-07-08
  • sean-barton - 2022-07-06

    PROBLEM:
    I am transmitting sensor data from a CANopen slave to a CANopen master using Acyclic - synchronous PDOs to limit the amount of traffic, because the sensor data changes constantly. I am experiencing problems with SYNC (see below) and am wondering whether it would be better to change the transmission type of the PDOs from acyclic to asynchronous (with an inhibit time?).

    I am not very familiar with CANopen, but it has been fairly straightforward to set up until now.

    SETUP:
    I have a CANopen master and a CANopen slave device set up on the same CANbus network with the same baud rate (250 kbit/s). These are the only two devices on this network.

    The master is using a CANopen manager and has node-id = 1; the slave is a CANopen device with node-id = 2.

    The master has SYNC producing enabled on the CANopen manager.

    I have a mixture of PDOs on the slave set up as Asynchronous - device-profile-specific (Type 255), and Acyclic - synchronous (Type 0). I have exported the EDS file and have imported this EDS file into the master.
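
For reference, a minimal sketch of how the transmission type byte of a transmit PDO maps to the names used above. It assumes the standard CiA 301 numbering; the function name `pdo_transmission_type` is hypothetical and not part of CODESYS or any library.

```python
def pdo_transmission_type(value: int) -> str:
    """Name a TPDO transmission type byte, assuming CiA 301 numbering."""
    if not 0 <= value <= 255:
        raise ValueError("transmission type must fit in one byte")
    if value == 0:
        # Sent after the next SYNC, but only if the data changed.
        return "acyclic synchronous"
    if 1 <= value <= 240:
        # Sent on every Nth SYNC regardless of data changes.
        return f"cyclic synchronous (every {value} SYNC)"
    if value in (252, 253):
        return "RTR-only"
    if value == 254:
        return "asynchronous, manufacturer-specific"
    if value == 255:
        return "asynchronous, device-profile-specific"
    return "reserved"  # 241..251


for t in (0, 1, 255):
    print(t, "->", pdo_transmission_type(t))
```

Only types 0 to 240 depend on the SYNC message; types 254 and 255 are sent on events and can be throttled with an inhibit time instead.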

    OBSERVATION:
    I have a PCAN USB adapter to view the CAN traffic. When I restart both master and slave, I can see the master send out 8102 (reset node 2) as a broadcast message. At this time the slave shows status as 7F (127 - preoperational) and the master as 05 (operational). Eventually, after several seconds, the master broadcasts 0102 (start remote node 2), and the slave responds with all PDO data and changes status to 05 (operational). At this point the slave stops operating: the status LEDs on the slave go from green to red and the traffic from the slave stops (the cycle time and counts no longer change).

    If I disable the SYNC producing setting in the master CANopen manager, the above observation is the same except only Asynchronous PDO data is showing on the CANbus, and the slave continues to operate.
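
The byte values in the observation can be read with a small decoder. This is only a sketch that assumes the default CANopen COB-IDs (0x000 for NMT, 0x080 for SYNC, 0x700 + node-id for heartbeat); `describe_frame` is a hypothetical helper, not a PCAN or CODESYS function.

```python
NMT_COMMANDS = {
    0x01: "start remote node",
    0x02: "stop remote node",
    0x80: "enter pre-operational",
    0x81: "reset node",
    0x82: "reset communication",
}

HEARTBEAT_STATES = {
    0x00: "boot-up",
    0x04: "stopped",
    0x05: "operational",
    0x7F: "pre-operational",
}


def describe_frame(cob_id: int, data: bytes) -> str:
    """Describe an NMT, SYNC, or heartbeat frame (default COB-IDs assumed)."""
    if cob_id == 0x000 and len(data) == 2:
        command, node = data  # first byte: command, second byte: target node
        target = "all nodes" if node == 0 else f"node {node}"
        name = NMT_COMMANDS.get(command, "unknown command")
        return f"NMT: {name} -> {target}"
    if cob_id == 0x080 and len(data) <= 1:
        return "SYNC"
    if 0x701 <= cob_id <= 0x77F and len(data) == 1:
        state = data[0] & 0x7F  # bit 7 is the node-guarding toggle bit
        name = HEARTBEAT_STATES.get(state, "unknown state")
        return f"heartbeat from node {cob_id - 0x700}: {name}"
    return "other"


print(describe_frame(0x000, bytes([0x81, 0x02])))
print(describe_frame(0x000, bytes([0x01, 0x02])))
print(describe_frame(0x702, bytes([0x7F])))
```

Under these assumptions, a single 00 byte on 0x702 is the boot-up message of node 2, not a heartbeat state.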

     
  • DavidBo - 2022-07-06

    How many bits do you use for PDO and what is the "Cycle Period" and "Window Length"?
    Does the slave expect a "Heart beat" or "Node guarding"?

     
    • sean-barton - 2022-07-06

      All PDOs are 64 bit. The cycle period is 100000 microseconds with a window of 120000 microseconds.

      The slave has, under guarding, nodeguarding disabled and heartbeat producing enabled with a producer time of 200ms. The heartbeat consuming shows "Node-ID of Guarded Node" = 1 and "Consumer time (ms)" = 300 and this line is enabled.

      The master has, under guarding, heartbeat producing enabled with a node-id = 1 and producer time of 200ms.

      Thank you for responding.

       
  • DavidBo - 2022-07-06

    How many PDOs do you have?
    The Cycle period has to be larger than the Window length. If I understand it correctly the window is the period of time where the PDOs are transmitted and received.
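
A rough sanity check of the timing values mentioned so far could look like the sketch below. It encodes only two rules: the one stated above (window length shorter than cycle period) and the usual CANopen guideline that a heartbeat consumer time should exceed the producer time. The function `check_timing` and its parameter names are hypothetical.

```python
def check_timing(sync_period_us: int, window_us: int,
                 hb_producer_ms: int, hb_consumer_ms: int) -> list[str]:
    """Return a list of human-readable problems; empty means no rule was violated."""
    problems = []
    if window_us >= sync_period_us:
        problems.append("window length should be shorter than the SYNC cycle period")
    if hb_consumer_ms <= hb_producer_ms:
        problems.append("heartbeat consumer time should exceed the producer time")
    return problems


# Values from the thread: 100000 us cycle, 120000 us window, 200 ms / 300 ms heartbeat.
print(check_timing(100_000, 120_000, 200, 300))
# Same values with cycle period and window length exchanged.
print(check_timing(120_000, 100_000, 200, 300))
```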

     
    • sean-barton - 2022-07-06

      There are 63 receive PDOs and 43 transmit PDOs - 106 in total. I didn't know that the window length had to be smaller; the defaults were 1000 for cycle time and 1200 for window (microseconds), and I just increased both by the same factor.

      I tried exchanging the values for cycle period and window length (microseconds) to 120000 and 100000 respectively, and enabling the SYNC but the same problem occurs.

      I have a bus load analyzer which shows the bus load no higher than 18% on startup but 0.2% on average so the bus does not appear to be overloaded.

       
  • DavidBo - 2022-07-07

    "If I disable the SYNC producing setting in the master CANopen manager, the above observation is the same except only Asynchronous PDO data is showing on the CANbus, and the slave continues to operate."
    Does that mean that without SYNC, the LED on the slave remains green?
    What is your system (Raspberry Pi or something else?)

     
    • sean-barton - 2022-07-07

      Yes, that is correct. Without SYNC the slave LEDs remain green and the slave operates as normal. The system is using two mobile industrial ECUs from the same manufacturer.

       
  • DavidBo - 2022-07-07

    Try to disable heart beat producing on the master, but enable SYNC with your new timing settings.
    Please disable heart beat on the slave too.

     

    Last edit: DavidBo 2022-07-07
    • sean-barton - 2022-07-07

      I disabled both heartbeat producing on master and slave with the new SYNC settings. On the CANbus I observed the reset message from the master but the slave does not report a pre-operational state, only 00. When a "Start Remote Node" message is issued from the master, the slave stops working as before, with the red LEDs showing and no activity from the slave on the CANbus.

      After the above observation, I also noticed heartbeat consuming on the slave was enabled. I disabled this setting but the result was the same, the slave ECU stops responding on start message from the master.

      I should note that the ECUs are SIL2 but I have not implemented any code in the Safety PLC side. I'm not sure if this matters.

       
  • DavidBo - 2022-07-07

    I don't know SIL2, but since it has something to do with safety, one could imagine that heart beat might be a requirement. If you enable the heart beat, let the time for heart beat producing be a multiple of the SYNC period.

     
    • sean-barton - 2022-07-07

      I changed the heartbeat time from its original 200ms to 240ms as a multiple of the SYNC period of 120ms. At first it seemed to work, so I disconnected to change a PDO I have running as Asynchronous to Acyclic which also has active data transferring on it. I reconnected, and it appeared to work but after about 15 seconds the slave crashed again. Usually the connection stops immediately but this time it was working and I could verify that the data was transferring to the other device.

      That was on the first few connections, but now the slave has gone back to crashing almost immediately every time a connection is made. Since the first successful connection, I've only played with the timings in the hopes of getting the connection back - even putting everything back to when the first successful connection was made but the slave still crashes.

       
  • DavidBo - 2022-07-07

    In the "window" all PDOs are transmitted and received. You have 100ms and I think you need 50ms, so that should be alright.
    But after that you have only 20ms left for SDOs and, I suppose, heart beats (but I don't know). You have 3 heart beats, one from each slave and one from the master, so maybe 20ms is not enough. What about a period of 160ms and heart beats of 320ms? What happens if you power the system off and on? CODESYS has a log for each slave; what do these say?
    What is your application cycle (not SYNC cycle)?
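
The 50ms estimate can be checked with a back-of-the-envelope calculation. The sketch below assumes classic CAN frames with 11-bit identifiers, 8 data bytes per PDO, a 250 kbit/s bus, and all 106 PDOs being sent once back to back; the helper names are hypothetical.

```python
def frame_bits(data_bytes: int = 8, worst_case_stuffing: bool = False) -> int:
    """Bits on the wire for one classic CAN frame with an 11-bit identifier."""
    # 44 bits of fixed frame fields + 3 bits interframe space, plus the payload.
    bits = 47 + 8 * data_bytes
    if worst_case_stuffing:
        # Upper bound on stuff bits over the 34 + 8n stuffable bits.
        bits += (34 + 8 * data_bytes - 1) // 4
    return bits


def burst_time_ms(frames: int, bitrate: int = 250_000,
                  data_bytes: int = 8, worst_case_stuffing: bool = False) -> float:
    """Time in ms to send `frames` frames back to back at `bitrate` bit/s."""
    return frames * frame_bits(data_bytes, worst_case_stuffing) / bitrate * 1000


print(frame_bits())                      # bits per frame without stuffing
print(frame_bits(worst_case_stuffing=True))
print(round(burst_time_ms(106), 1))      # 63 RPDOs + 43 TPDOs
print(round(burst_time_ms(106, worst_case_stuffing=True), 1))
```

Under these assumptions one frame is 111 bits without stuffing and at most 135 bits with it, so 106 PDOs take roughly 47ms to 57ms, which is in line with the 50ms figure.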

     
    • sean-barton - 2022-07-08

      I made the changes you suggested and I've noticed the slave can run for several minutes before crashing.

      In most cases, I have been powering off/on both ECUs to test on startup.

      There are typically errors due to peripheral status and items that haven't affected the ECU in the past, but because you pointed out the log, I ran through it more thoroughly. While watching the log during the start process, I noticed four errors that occur after the slave crashes. They seem to show some sort of "fatal" status (not exceptions) but it's not very clear. I'll dive into these errors more deeply.

      The application and CAN cycles are on different tasks. The CAN task average is 8.5ms with a maximum cycle time of 16ms. The task interval is set to 100ms. There is no jitter on either task.

       
