
Server stops responding


Buckie

This topic is 2685 days old. Please don't post here. Open a new topic instead.


All right guys, this might have been discussed already somewhere, but I cannot find it, so sorry for a potential duplicate.

 

I'm trying to get to the root of a problem here. At random times, a perfectly ordinary FM Server v10 running on OS X 10.5 with all the latest updates installed stops responding and serving clients. Attempts to connect time out after about 30 seconds to a minute. Admin tools (such as fmsadmin) stop working and wait forever for the server to reply to a command. Nothing crashes, all the processes appear to be running, and there is nothing in the logs. fmserverd keeps the port open and I can connect to it via telnet or nc, but FM Pro clients cannot connect at all and cannot even list files. Restarting the machine helps. In fact, the logs suggest everything is working fine: when the machine is restarted in that state, the graceful closing of all databases and the stopping of FM Server are recorded in the logs, so the server doesn't appear to be "stuck".
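To make the "port is open but nothing answers" state easier to tell apart from a refused or unreachable port, here is a minimal Python sketch. The function name `probe` and its return labels are made up for illustration. It knows nothing about the FileMaker wire protocol, so a healthy server may also show up as "silent" if it waits for the client to speak first; the result is only useful when compared against a baseline taken while the server is healthy.

```python
import socket

def probe(host, port, timeout=5.0, payload=b""):
    """Classify a TCP endpoint as 'connect-timeout', 'connect-failed',
    'silent', 'closed', or 'responding'. Labels are illustrative only."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return "connect-timeout"      # handshake never completed
    except OSError:
        return "connect-failed"       # refused, unreachable, etc.
    with sock:
        sock.settimeout(timeout)
        try:
            if payload:
                sock.sendall(payload)
            data = sock.recv(1)
        except socket.timeout:
            return "silent"           # connected, but no byte came back in time
        except OSError:
            return "closed"           # connection reset while waiting
        return "responding" if data else "closed"
```

For example, `probe("fms.example.com", 5003)` could be run periodically; both the host name and the port number are assumptions here and should be replaced with whatever the server actually listens on.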

I used to get the same kind of behavior on OS X 10.8, which is unsupported, and I assumed that problems were to be expected for that reason. Later, however, I migrated to a different machine running OS X 10.5, and I still get exactly the same problem from time to time.

Any advice?

Link to comment
Share on other sites

Since it happens on two machines: what other software do you have installed that was also on the other machine? If you find no evidence of crashes in the FMS and system logs, then it could be something else that is running interference.

Link to comment
Share on other sites

Thanks for replying, Wim.

The truth is that those two machines are very different, and the one with OS X 10.5 was specifically set up from the ground up to be free from other software. It's a 10.5 OS X Server that is running DNS (secondary) and Mail services, both very low volume, and there's only one user configured on the system, for Mail specifically. Nothing else: no Open Directory, no Time Machine, and definitely no third-party software. Nothing has been updated manually; only Software Update has been used. The only serious change I made to the FM config was telling it to use the maximum possible amount of RAM. So it's really puzzling.

I suspect it may happen during daily (nightly in my case) backups, but that's very hard to prove: it may well take a month for the problem to happen again, and again there's nothing in the logs to suggest that this really is the case. Usually, though, people find they can't get to their databases the next morning.

My second suspicion is that there might have been a Java update that FM Server doesn't like, but that shouldn't really affect the server portion, only the admin panel.

I only wish there was a way to dig deeper and get at least some info from the FM Server running in that semi-available state, but I just don't know how. At least if the ports became silent I could write a simple script but that's not the case.
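Since the port stays open but fmsadmin hangs forever in this state, one possible watchdog is to run an admin command under a time limit and treat "did not finish" as the failure signal. The sketch below is only that, a sketch: the helper name `command_hangs` is hypothetical, and the command to run is passed in by the caller rather than assumed.

```python
import subprocess

def command_hangs(argv, timeout=30.0):
    """Return True if the command in `argv` does not finish within
    `timeout` seconds. The child process is killed on timeout."""
    try:
        subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,    # never wait for interactive input
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return True                      # no answer in time: treat as hung
    return False                         # finished (exit code not inspected)
```

A cron job could call this with whatever fmsadmin invocation normally returns quickly on the machine in question and send an alert when it returns True. Note that it only detects the hang; it does not explain it.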

Link to comment
Share on other sites

 

"I suspect it may happen during daily (nightly in my case) backups"

 

 

Well, you just stated that there was no 3rd party software running and no Time Machine, so what is doing the backup!?

External backups are always suspect.

 

"Second suspicion is that there might have been a Java update and FM Server doesn't like that but it shouldn't really affect the server portion only the admin panel."

 

 

Wrong. There is a lot of Java going on in FMS, not just the admin console. So don't update anything on the machine beyond the FMS requirements.

If you have to update because of the other roles the machine has then this is a typical case of why you'd want to dedicate a machine to FMS if that role is important enough...

 

I would double and triple-check the FMS logs and the various OS logs to look for any signs of trouble.  I don't think I have ever seen an FMS deployment display these kinds of issues without a trace somewhere.

I realize that this kind of sleuthing can and will take up many hours, but there it is.

Link to comment
Share on other sites

By backups I meant the regular scheduled database backups performed by FileMaker Server itself.

 

I already did check all the logs, including the crash logs... nothing in there at all. It used to crash before on another machine; at least that meant something. Now it's all just silence. The strange part is that it doesn't really crash and doesn't even stop logging; it only stops servicing new connections on its port, which remains open.

 

Do you happen to know if it's possible to painlessly downgrade Java on OS X, or if it's going to require a complete OS reinstall?

Is there an option to make FM logs really verbose?

Link to comment
Share on other sites

  • 2 years later...

This is probably irrelevant to everybody at this point in time; however, I did finally manage to isolate the circumstances under which FileMaker Server version 10 on a Mac may stop responding, which also essentially opens the door to a very simple-to-execute DoS attack on said server.

Due to a misconfiguration at my internet provider, or maybe a deliberate configuration change somewhere, I started seeing a problem where, if you don't interact with the server/database for 15 seconds, the connection goes dead without being torn down via the normal TCP mechanisms. All subsequent attempts to send something down that pipe simply yield nothing, and you must establish the connection again. I'm still trying to battle that particular issue. During testing, however, it led to an unexpected discovery. So,
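For code that owns its own sockets, TCP keepalive is one common way to notice this kind of silently dead connection. The following is a minimal sketch, not a fix for FileMaker itself, whose sockets cannot be configured this way from outside. The function name and the default values are assumptions, and whether a middlebox that drops idle flows counts keepalive probes as activity is not guaranteed.

```python
import socket

def enable_keepalive(sock, idle=10, interval=5, count=3):
    """Turn on TCP keepalive so a dead peer is eventually reported as an error.
    The tuning options differ per platform, so each one is guarded."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):        # Linux: seconds idle before first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    elif hasattr(socket, "TCP_KEEPALIVE"):     # macOS name for the same setting
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):       # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):         # failed probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

With an idle time below the 15-second cutoff described above, probes would at least be sent before the path goes quiet; whether that keeps the connection alive depends on what is actually dropping it.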

1. Establish a connection to the server using telnet. Don't send anything, just let it sit like that.

2. Force the connection to break physically, so that it is not terminated through the normal TCP teardown (unplugging the cable works, for example), and close your telnet session/client.

That's it! We're done! It won't break any existing, established connections, which continue to work. However, nobody will be able to establish any new ones, and this condition will persist indefinitely; there's no timeout. Any attempt to stop or restart FileMaker Server with `fmsadmin server stop` will fail, and the only way to return to normal is to manually kill the FileMaker Server processes or, better yet, reboot.

One interesting thing, and probably the reason it eluded QA, is that this only happens if you send nothing during the telnet session, essentially just establishing a TCP connection and nothing else. If a regular FM Pro client successfully connects and then abruptly disconnects in the middle of doing something, nothing bad happens.
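The symptoms are consistent with a well-known class of bug: a loop that accepts a connection and then does a blocking read with no timeout before it accepts the next one. Whether FMS 10 actually works this way is pure speculation on my part; the toy server below is a hypothetical model, not FileMaker code. It shows how a single silent client leaves the port open (new TCP handshakes still complete in the kernel) while no new client ever gets an answer.

```python
import socket
import threading

def naive_server(listener):
    """Toy server: handles one client at a time and reads without a timeout."""
    while True:
        try:
            conn, _ = listener.accept()
        except OSError:
            return                       # listener was closed
        with conn:
            try:
                data = conn.recv(1024)   # blocks for as long as the client is silent
                if data:
                    conn.sendall(b"ok\n")
            except OSError:
                pass                     # peer went away; move on to the next client

def try_request(port, timeout=1.0):
    """Send one request; return the reply, or None if none arrives in time."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as s:
        s.sendall(b"hello\n")
        try:
            return s.recv(1024)
        except socket.timeout:
            return None

def demo():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))      # any free local port
    listener.listen(5)
    port = listener.getsockname()[1]
    threading.Thread(target=naive_server, args=(listener,), daemon=True).start()

    before = try_request(port)           # served normally
    idle = socket.create_connection(("127.0.0.1", port))  # connects, sends nothing
    after = try_request(port)            # connect succeeds, but no reply comes
    idle.close()
    listener.close()
    return before, after
```

In this model the stall ends as soon as the idle client closes cleanly. If the link is cut instead, the server never sees the close, and without a read timeout or keepalive the read can block indefinitely, which would match the "no timeout" observation above.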

It doesn't happen on FMS 15/Mac and I haven't tested either version on Windows. It'd be very interesting to know if the owners of Server version 11 suffer from the same problem, if anyone's willing to test their installs.
