Status code 504 in workflow

Hi!

I am having trouble with my workflow. It calls an API, iterates through the returned list to obtain the data needed to call a second API, then collects the data from the second list and saves it in new records. This is what the workflow looks like.

When I run the workflow, the records are created and saved, but at some point the workflow fails with the error “Request failed with status code 504”.

The error also does not always occur at the same point: sometimes after 170 records, sometimes after 180 records.

The first list has around 30 items and the second list has an average of around 20 items.
9 fields are obtained and saved into each new record.
I am not sure what the issue is here.

Thank you very much for the help!

Have you checked your infrastructure? This is a timeout from the server,
so you could configure the server with a higher timeout value.

Do you know exactly which configuration should be changed? I tried editing some of the time_out fields in the system configuration described in “Server configuration :: Corteza Docs”, but I am still having the same problem.

From the documentation, I saw that workflows should run indefinitely. Is there a timeout when calling an external API?

Thank you for the help!

@nutella 504 is an HTTP error that is not related to Corteza.
Check your server’s configuration (or your load balancer’s) and increase the timeout.

The timeout configuration on the server seems to be fine, but the error still occurs. Is it possible that the error is caused by the API calls?

If you call some external service then yes; if that request errors out then the workflow would also.
If that is the case, you can use error handlers to catch errors.
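The idea of catching a failed request instead of letting it abort the whole run can be sketched in Python (this is not Corteza workflow syntax, and `process_all` and `handle` are hypothetical names used only for illustration):

```python
def process_all(items, handle):
    """Apply handle to each item, collecting failures instead of stopping.

    handle is a hypothetical callable that performs the external request
    for one item and raises an exception if that request fails.
    """
    results, failures = [], []
    for item in items:
        try:
            results.append(handle(item))
        except Exception as err:  # deliberately broad: this is only a sketch
            # Remember which item failed and why, then keep going.
            failures.append((item, str(err)))
    return results, failures
```

With this shape, one failing request is recorded for later inspection while the remaining items are still processed.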

I did some testing with the workflow, and the external APIs are working perfectly. When I decrease the size of the payload/arrays, the data is saved in the records without problems.

If status code 504 is not related to Corteza, does it mean there is a timeout configuration that I must have missed and that is stopping the workflow?
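Since smaller arrays go through, one possible workaround is to split the first list into smaller batches and process each batch in its own run. A minimal sketch of the splitting in Python (not Corteza workflow syntax; `chunked` is a hypothetical helper, and the batch size of 8 is an arbitrary assumption):

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example: a first list of 30 items split into batches of at most 8.
first_list = list(range(30))
batch_sizes = [len(batch) for batch in chunked(first_list, 8)]
# batch_sizes is [8, 8, 8, 6]
```

Each batch then produces fewer records per run, which may keep a single run below whatever timeout is being hit.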

@nutella
How many minutes does your server take to time out?

I encountered something like this before: my load balancer timeout was configured to 3 minutes, and I increased it to 15 minutes.

To easily find out what your infrastructure’s timeout is, find or create a lengthy workflow (one that iterates over a lot of records), run it manually while you stay on the page, and see how many minutes it takes to get the 504 status code.

I think you had better contact someone who knows how your servers are hosted.
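The same measurement can also be scripted. Below is a rough Python sketch, assuming the long-running workflow can be started by a plain HTTP request; `time_until_failure` and `trigger_workflow` are hypothetical names, and the URL you pass in is whatever starts the workflow in your own setup:

```python
import time
import urllib.error
import urllib.request


def time_until_failure(trigger, clock=time.monotonic):
    """Call trigger() once and report how long it ran and how it ended.

    Returns (elapsed_seconds, status), where status is the HTTP status
    returned by trigger(), or the status of the HTTPError it raised.
    """
    start = clock()
    try:
        status = trigger()
    except urllib.error.HTTPError as err:
        status = err.code
    return clock() - start, status


def trigger_workflow(url):
    # Hypothetical trigger: replace with the request that starts your workflow.
    with urllib.request.urlopen(url) as resp:
        return resp.status
```

It would be called as `time_until_failure(lambda: trigger_workflow(url))`. If the 504 keeps arriving after roughly the same round number of seconds across several runs, that suggests a timeout configured on a proxy or load balancer in front of the server.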

Hi @munawir ,

I get a lot of 504 errors on my server with a workflow. It crashes corteza-server, and all associated services (corredor, DB, etc.) become unstable. It is as if, once one thing starts to go wrong, all the other services follow.
The load on the server goes up and the only way out is to stop/restart Docker.

It looks like corteza-server handles DB/query/workflow management badly with a lot of data and becomes totally unstable… I’m on nginx/percona with version 2021.9.8.

How did you solve that? Just by increasing the timeout to 15 minutes? Did you, like me, get a 504 and then find that you could do nothing?

If you have 1000 records to add in a loop, it is unbelievable to have to wait that long.
As far as bad user experience goes, this is at the top… :)

Thanks !

I think I got your issue

sometimes when I hit the server too many times with heavy automation/rest API requests … it suddenly lags and doesn’t perform any write operation till I restart the docker/server

@tjerman what do you think is the reason ?

Yes, for me this is something very critical in Corteza… it totally crashes the server (I mean corteza-server), and the only way out is to restart Docker if you can; otherwise you sometimes have to restart the whole server because of the load/memory problem…

That is a very critical point. For my part, I consider that Corteza cannot be used in production with that problem… it works for small things but not in a real production environment…

Hi @Mike @munawir @nutella

thanks for the report, we are aware of a potential data race in the workflow/API gateway subsystem and are trying our best to weed the bugs out.

Your reports definitely help, so please share any more info you have about the circumstances under which those issues happen.

@nutella , thanks for the workflow example, that definitely helps and once we get some more info on it, we’ll post the update here.
Do you have issues with the payload size on HTTP request?
Do the issues start with the second iterator?
Is the size of the response an issue?
Is the amount of items on one of the iterators an issue?

I will check but the more info we get, the easier (not easy :)) it gets to fix these kinds of issues, thanks.

Cheers, Peter

Hey @peter
I can help to reproduce the issue … and I’ll write more scenarios when I get back to the office
