API record save stops working until restart

I am facing an issue where API POST requests for records stop working until I restart the container. This happens completely at random, and all GET APIs keep working. I can't find any relevant logs either.

The API request does not complete at all. I tried to switch on all the logs, but that hangs the whole system.

So any POST request fails; even the prompt POST request fails…

[Screenshot 2021-06-03 161231]

  • What version are you running? 2021.3.5 is the current latest

  • Is this happening on the CRM/Service Solution namespace or on a custom namespace?

  • Have you created any custom workflows/automation scripts that are executed for records?

  • What do you see in the server container logs if you add this to the .env file?


If you are able to narrow down the issue and provide a consistent step-by-step reproduction, that would be amazing.

I am running 2021.3.5

Custom Namespace

Yes, we have workflows, but when the issue occurs the workflows don't even get triggered, and I have not found any errors in the sessions.

I have this switched on and don't see any errors.

The issue is random, and right now I don't have steps to reproduce it.

I am going to check the code that sends the POST request. Maybe the request's POST body is malformed, and that is why Corteza's service becomes unresponsive due to some corner case in the data validation. I will update once I have something.

@tjerman Updated to 2021.3.6.

The issue is still there. I have debug logs enabled, but the logs don't have much in them.
Maybe the workflows are hanging, as I can see lots of sessions that have started but have neither completed nor failed.
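
One way to make the "started but never finished" observation concrete is to filter an exported list of sessions by status and age. This is only a rough sketch: the `status` and `created_at` fields and the status names are hypothetical placeholders, not Corteza's actual session schema, and the 10-minute threshold is an arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone


def find_stuck_sessions(sessions, now, max_age=timedelta(minutes=10)):
    """Return sessions that have not finished and are older than max_age.

    Each session is a plain dict with hypothetical keys:
    "id", "status" and "created_at" (a timezone-aware datetime).
    """
    finished = {"completed", "failed"}  # assumed status names
    return [
        s
        for s in sessions
        if s["status"] not in finished and now - s["created_at"] > max_age
    ]
```

Running this periodically against whatever session export is available would show whether the number of unfinished sessions grows right before the POST requests start to hang.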

The workflows themselves are fine, as I have tested them multiple times.
If the Automation/Workflow service hangs, can this type of issue happen?
The reason I think the automation service hangs is that when this happens, the prompts request fails.

Is there a way to restart a single service?
This issue is really a blocker. It is so random that it's hard to reproduce.

  • What are the status code and the error message of the failed prompt request?
  • Can you send me the server container logs when this happens in case I do see something interesting? You can use the message feature here.
  • Are you able to share with me an example of a workflow that hangs?
  • Are you able to create a new workflow (as small as possible) that would hang like this?

If it hung due to an error, then yes, it should.

No. The Corteza Server container is as specific as you can go.

The request times out.
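
Since the failing requests time out instead of returning an error, a small probe can help pin down when the hang starts. The sketch below sends one POST with a client-side timeout and classifies the result. The function name and the `opener` parameter are made up for illustration; the URL, body and any authentication headers a real Corteza endpoint needs are left to the caller.

```python
import socket
import urllib.error
import urllib.request


def probe_post(url, body=b"{}", timeout=5.0, opener=urllib.request.urlopen):
    """Send one POST and return 'http <code>', 'timeout' or 'error'.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without a live server.
    """
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(req, timeout=timeout) as resp:
            return f"http {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return f"http {exc.code}"
    except urllib.error.URLError as exc:
        # Timeouts while connecting or sending arrive wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "error"
    except (socket.timeout, TimeoutError):
        # Timeouts while waiting for the response are raised directly.
        return "timeout"
    except OSError:
        return "error"
```

Calling this every minute or so and logging the result with a timestamp would give a rough time window for the lock-up, which can then be matched against the server container logs.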

Please check

The thing is, I am not sure which workflow hangs and when. More or less, most of them have hung at least once.

Uhh… strange… well I’ll be addressing some workflow related issues today and I’ll hopefully stumble on this one as well.

If not, I’d ask if you can continue to try and get a set of steps to reproduce this (preferably on a fresh Corteza install – no custom modifications).


So yesterday we made some changes to our workflows. Previously we had multiple workflows for the same module triggers; I'm not sure whether that is bad practice or not. Anyway, yesterday we changed all of those to a single workflow per trigger. So far it seems to be stable. I'm not sure if that fixed it; I am still testing.

We found out that it’s way easier to handle fewer bigger automations rather than multiple smaller automations. Either should work, but having fewer automations makes it easier to debug issues.
You can use swimlanes to indicate what does what. We will also be adding more support for reusability and nicer workflow organization in the future.

I was able to run into something that looks like what you’ve described. Some strange state caused the automation system to lock up, which prevented the prompts request from completing, but the rest was working fine.

I will be investigating this more in the following days.


The system hung again, so as you said, it doesn't matter whether the workflows are big or small. I have the logs.

Hopefully it can be resolved soon.

@tjerman Hi! Are you able to reproduce this issue? Any update on the fix? Thanks.

Well, I was able to get it not working the last time I posted here, but when I went to try and debug it, I was just not able…

I did find a few potential causes for this (charms of parallel execution), which should be included in the next patch release (2021.3.8).

Were you able to uncover anything on a fresh install?


I did a fresh install and created a few workflows, but the issue is not consistently reproducible.

I will update you if I can get it to reproduce consistently. I have one workflow that always fails, but it doesn't hang the whole automation. I will create a fresh install, test that workflow, and update you.

@sourav.mukherjee we found and resolved a few things related to this issue.
The patch will be released later today (2021.3.9), so please confirm whether this has resolved your issues as well.


I will certainly check and report back after testing. Thanks.

The update was released yesterday evening in case you’ve missed it; Add 2021.3.9 changelog · cortezaproject/corteza-docs@0bead1f · GitHub

The workflows haven't hung so far… I will also test this in the new release, 2021.3.10.