Recovery as Pre-emptive Diagnostic Tool, aka Red Herrings or Canaries?

jayivan · January 15, 2014

How useful is Recovery (in FileMaker Pro 11 and FileMaker Pro 13) as a pre-emptive diagnostic tool? If problems are fixed and reported, even if no symptoms of any problems are apparent, would you recommend alerting these clients to a need to rebuild those solutions from scratch and begin that process? Or…since there are no data or performance issues manifesting themselves in the live files, would you recommend just proceeding as is, understanding that there is the possibility of corruption later?

Here's some context...

I ran recovery on a number of separate client solutions, and many came back as having had a few structure or schema items modified (under 10) and no bad blocks. FileMaker recommends not using these recovered files. The original files themselves have displayed no problems whatsoever--recovery was done to see what the results would be. All backups, when recovered, show the same recovery modifications made. These solutions have been in use for years, and even the original files saved before deployment contain these errors--in other words, no original which Recovery doesn't modify. Any recovered library 'blobs' are pulled from the Home table in these solutions, which store settings like background colors and logos.

On practical terms, it is hard to justify to the client the investment in fixing problems which don't exist in a practical way…The question is, do these problem really in fact exist and the client is just waiting to uncover them. Are the Recovery tools acting like a canary in the coal mine saving us from some suffocating data loss down the road, or are these "fixes" red herrings sending me upstream for weeks of unnecessary work?

Lee Smith · January 15, 2014

Hi Jay, and welcome to the FMForums,

There have been several threads in the past regarding this subject. I just did this Google Search for site: fmforums.com Recovery site:fmforums.com and received several hits on this topic.

HTH

Lee

Wim Decorte · January 16, 2014

Running recover on a backup is a good diagnostic tool. As to telling the client the solution has to be re-written: depends. No good single answer there. At the very least the client needs to be informed because it is a risk to their solution.

The one crucial thing that you are not mentioning is: fixing the deployment so that that damage does not happen again. There is no point i doing a rewrite if the damage will keep happening because of a faulty deployment that led to the damage in the first place.

jayivan · January 16, 2014

Thank you Lee for the additional links and Wim for your perspective. Let me clarify my question, which I haven't found addressed in the other posts I've read. The posts I've read concerned damaged files that were recovered. I'm hoping to gather more information about pro-active use of recovery where files do not exhibit any signs of damage.

Could pro-active use of Recovery show a problem where no significant problem exists? Could Recovery show a problem that should not be a cause for alarm, and which does not suggest either using a recovered file or reverting/rebuilding?

Some diagnostic tools are perhaps overzealous. Among my drive technical tools, I use DiskWarrior nearly exclusively--the scope of what it does is limited, it makes no repairs before reviewing changes with the user, and has not falsely identified any issues in my experience. I no longer use Norton or Tech Tool on my Macs--these tools have a broader scope and I've experienced a few repairs (of innocuous issues) which lead to decreased stability.

I've gathered from developers and FileMaker documentation that FileMaker Recovery can be a bit brutish in its recovery efforts, which is why the use of recovered files is discouraged by some developers, and very strongly discouraged by FileMaker. In light of this, could using Recovery as a diagnostics tool be identifying relatively innocuous problems?

Here's one scenario resulting from a pro-active use of Recovery: a file has never displayed any working problems, but Recovery identifies a number of bad blocks as well as many modifications to the schema. This seems to be a stronger case to either rebuild the file, or identify a backup without these when recovered.

Here's another scenario: a file has never displayed any working problems, but Recovery lists that 2 modifications made to the file structure without any block or schema changes. Perhaps in this case, the original file continues to be used and the file recovered for diagnostics purposes is deleted. We wouldn't want to use the recovered file, and the original file is healthy enough to continue using, especially since there's no evidence of a problem other than what Recovery informed.

Are certain problems indeed more significant than others, certain types of problems just red herrings, or is any problem identified by Recovery a sure sign to rollback/rebuild?

Wim Decorte · January 16, 2014

You're asking for an absolute blessing and there is none.

FMI has never revealed how it determines damage to a file so the only safe way is to treat reported damage as BAD. There is no way we can score damage on a scale.

ANY damage has to be taken seriously and the deployment has to be scrutinized. In my mind that's the big thing that you have not acknowledged yet. Damage does not happen out of the blue, something is causing it. So instead of trying to figure out how bad the damage is and how safe it would be to continue to use damaged files, I would concentrate on what could causes it and rectify that to prevent ongoing damage.

David Jondreau · January 16, 2014

If the file isn't displaying any problems I would just keep using it.

jayivan · January 20, 2014

Wim, while not looking for an absolute blessings, I was hoping one of us would have more information on the relative severity of block vs schema vs structure changes. Without that, there is surely no way to determine THE most prudent course of action, other than to acknowledge there is an underlying problem to the file, even if there is no apparent problem in day to day operations or backup consistency checks. Perhaps that is the reason for the lack of grey area among developers in response to diagnostic recovery modifications: like David, many suggest to push on until there's a reason to implement a rebuild; others (and I think you may be in this camp), suggest damage is damage and a rebuild prudent.

Since our clients are currently connecting to up-to-date FileMaker Servers with UPS protections, the only certainties seem to be the need to save copies of files following all major iterations, and to run Recovery as a diagnostic tool following any fms crash or hardware failure (and then revert to backups should any previously ok files now report modifications). Facing forward is clear. I guess we'll have to wait for FileMaker to provide a better diagnostic tool to give the best advice for inherited/legacy files still in use.

Wim Decorte · January 20, 2014

others (and I think you may be in this camp), suggest damage is damage and a rebuild prudent.

No, I realize that a rebuild is not always feasible.

Two things need to happen when damage is found:

1) It has to be documented and the client needs to be informed. This is not the sort of thing you want to try and hide. Properly informed, the customer can make the decision on a rebuild or not

2) the server deployment needs to be scrutinized for all usual suspects that are known to damage files. Hopefully further damage can be avoided that way.

the only certainties seem to be the need to save copies of files following all major iterations, and to run Recovery as a diagnostic tool following any fms crash or hardware failure (and then revert to backups should any previously ok files now report modifications).

Again, what I am missing here is a periodic review of the server deployment for adherence to best practices and looking for things that can damage files. Files are not hosted in a vacuum and damage does not happen magically.

LaRetta · January 20, 2014

the only certainties seem to be the need to save copies of files following all major iterations, and to run Recovery as a diagnostic tool following any fms crash or hardware failure (and then revert to backups should any previously ok files now report modifications).

Regarding portion I highlighted in red and bold ... any file which crashes should have its data recovered and imported into backup even if the log says no problems were found. Always. Each time a file crashes, it can cause damage particularly if a record was in middle of modification and the log does not always report it You may not see it for days or months until the record is next modified (if it ever is) which can crash your file again and on it goes. Then one day it simply will not open.

If you always go to the backup you at least know that the current crash did not damage further and add to the file's existing problems.

And yes, sometimes we must hold our breath and keep working in damaged files until new ones can be built. And when we finally get a chance to create a new solution from start, be SURE to protect it with everything you've got which is precisely Wim's point. Disaster Recovery is the most ignored (and most important) aspects of any solution and that starts with prime prevention.

Added to sentence the blue

Edited January 20, 2014 by LaRetta

Rick Whitelaw · January 20, 2014

I have a tale of corruption that I've told before, but I'll tell it again. I have a file which is central to my business solution. With data it's about 15mb. A clone is around 5mb. A few years ago I began running Recover occasionally to test files. None of the files had exhibited any problems whatsoever. On this file Recover reported schematic changes. It created one table "Recovered" and two fields in two tables (also "Recovered"). I had backups going back nearly to the very beginning of this file's existence. Even when the file was tiny, Recover reported the same problems. Exactly the same problems. So . . . I opened the recovered file and removed the offending "Recover" objects. THAT file comes up clean using Recover. However tempting it may be to use the recovered file (minus the spurious table and fields), I have resisted and have continued using the same file even though FMPA tells me the repaired recovered file is safe to use. I've heard too many warnings about using even a "clean" recovered file to consider using one. One cold comfort I suppose is that should the file ever exhibit funky behavior I could try the workaround I described (on a backup) before resorting to a rebuild and perhaps it would work. Almost everything about FMP seems transparent. The Recover process and the resulting files remain a mystery to most users and developers. In a sense the original post about using Recover as a preemptive strategy is bang on. In the end what other use has it? It can tell you a file isn't safe to use and there's a log, and it can tell you a file is fine. The one exception I see is that if a file IS exhibiting problems, Recover can tell you something's wrong, and I suppose that's really why it exists.

My two cents,

Rick.

jayivan · January 20, 2014

when we finally get a chance to create a new solution from start, be SURE to protect it with everything you've got which is precisely Wim's point.

Thank you both for your continued input. Here's some things done for protection: OS X set to not sleep, OS X hard drive set to NOT spin down, backups run hourly, server power connected to a UPS, server process always shutdown first prior to OS updates or restarts, the machine has a healthy amount of RAM, the boot volume is a single drive (and not a stripe or live mirror). I gather I should begin reverting to backups following any fms crash, kernel panic, or any other kind of non-graceful shutdown of the hosted files. Could you suggest additional protection measures, or suggest a link which lists this out? The searches I've run so far have not suggested much more than that, but I'm sensing I may be missing something.

Would you have suggested method for uploading the hosted file from my development workstation to the server itself? I've noticed that sometimes a Finder based copy of the closed file (or a zipped copy of the closed file) will lead to Recovery modifications on the server workstation. This behavior is inconsistent though.

jayivan · January 21, 2014

Just an update. I've been reviewing the recovery logs in more detail with the goal of correcting problems before they get worse, and noticed the structural modifications Recovery and records in its logs, could be addressed in the active file.

For example, the recovery log may show:

2014-##-## 12:29:43.862 -0500 MYFILE.fmp12 0 Recovering: field 'myfield' (33)

2014-##-## 12:29:43.863 -0500 MYFILE.fmp12 8493 Resetting invalid table key

2014-##-## 12:29:43.864 -0500 MYFILE.fmp12 8476 This item changed

If I create a new field to store the contents of 'myfield', copy the data over, and then delete 'myfield', recovery comes back clean. Using Recovery as a diagnostic has proved useful in this sense. I have not tried this approach in multiple scenarios, but it has been effective for the one client where I've tried this approach. The recovered files have been trashed, and the live files appear to get a green light from the Recovery process.

If any of you suspect this approach has hidden dangers, please let me know.

Rick Whitelaw · January 21, 2014

In the example "my field" isn't in itself triggering a problem. Recover "recovers" everything in a file.

LaRetta · January 21, 2014

If I create a new field to store the contents of 'myfield', copy the data over, and then delete 'myfield', recovery comes back clean. Using Recovery as a diagnostic has proved useful in this sense. I have not tried this approach in multiple scenarios, but it has been effective for the one client where I've tried this approach. The recovered files have been trashed, and the live files appear to get a green light from the Recovery process. If any of you suspect this approach has hidden dangers, please let me know.

Is it safer to use it after you've deleted the field? Maybe - nobody knows - it may worsen the problem because even if you correct that issue and FM says it is fine, it may have other hidden or collateral damage. Of course there are many solutions which exhibit corruption which run for years. Usually it is because the damage is in an older record or portion of the system which is not used regularly and it just hasn't been accessed recently. It is a risk. :crazy2:

It can also provide a false sense of security if it appears to work this time (no crashes since/yet) and next time it happens in a different file, you'll say, "well I'll try this because it worked last time" ...

I suggest that you continue to focus on what Wim said about scrutinising server process and preventative measures so you do not have to be in this position in future. BTW, every day after I work in design (ONLY stand-alone never LAN or WAN) while never letting the open file out of my sight so I know it has not experienced an issue, I run Recover on the files and if it indicates damage then I go to the back up (which I create every hour). I would rather lose an hours work than live with a damaged file forever or until it is rebuilt. The master is always used and kept pristine. I reached this OBSESSIVE position because of going through what you are going through and eight years ago Wim told me the same things; back then I did everything wrong (and I still do my fair share, mainly talking too much). :laugh2:

As for the description of your setup and preventative measures, a deeper detailed analysis and other deployment factors come into play. Maybe Wim can make additional suggestions; I am no expert on it and it is a deep subject. Make sure anti-virus does not run on server folders nor that sharing is on to them. There are thousands of posts on the subject, most of which he or Steven Blackwell (the other top authority on server) have responded.

Wim has a great training series here: http://www.vtc.com/products/FileMaker-Server-12-Tutorials.htm.

I know it can be frustrating to encounter a situation with no absolutes but the recover process is one of them. A file once damaged is forever untrustworthy so focusing on prevention is the real key. And if it crashes again, is it because of old damage or is something new happening with your deployment that needs your attention? You will never know.

Sign In

Recovery as Pre-emptive Diagnostic Tool, aka Red Herrings or Canaries?

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Create an account or sign in to comment

Create an account

Sign in

Similar Content

Important Information